In one embodiment, the nucleotide and polypeptide sequences of the present invention may
be used to design selective CA inhibitors. Studies have also shown that the different alpha-CAs have
different inhibitor binding properties (Sly et al., (1995), supra), suggesting that it may be possible to
provide compounds that inhibit a CA isozyme of interest, such as CA II, while not binding to or
inhibiting related enzymes such as the polypeptide of SEQ ID No. 390. The nucleic acid and
polypeptide sequences of the invention can be used in computer based drug design or for carrying
out binding predictions with candidate CA inhibitors in view of the extensive structural information
publicly available for CA enzymes. In preferred embodiments, the nucleic acid and polypeptide of
the invention are used in drug screening assays, including both cell based and non cell based assays.
In one embodiment, a nucleotide or polypeptide sequence of the invention is brought into contact
with a candidate CA inhibitor (such as a CA II inhibitor), and binding of the candidate inhibitor to
the polypeptide of the invention, or the activity of the polypeptide of the invention, is detected.
Activity of the polypeptide of the invention may be CA activity, or any other suitable activity
possessed by the polypeptide of the invention which may be inhibited by binding of the candidate
substance. In preferred embodiments, a panel of CA isozymes is screened against the candidate
substance, the panel including the polypeptide of SEQ ID No. 390 and one or more enzymes
selected from the group consisting of CA I, CA III, CA IV, CA VI, and a CARP including but not
limited to CARP VII, CARP X, and CARP XI. In preferred embodiments, a candidate CA inhibitor
is screened against one or more non-catalytic CA related proteins to eliminate undesired inhibition
of these proteins, which may be involved in other important physiological functions. Means to
conduct such drug screening assays are well known in the art.

Increasing alpha-CA activity for the treatment of alpha-CA deficiency disease 

The polypeptide of the invention may also be used as a source of CA activity, such as for
the treatment of disease. Defects in carbonic anhydrases are the cause of several diseases,
including osteopetrosis (abnormally dense bone), renal tubular acidosis, cerebral calcification and
mental retardation. Also, a carbonic anhydrase-related protein is described as being linked to
cone-rod retinal dystrophy (Bellingham et al., 1998, Biochem. Biophys. Res. Comm. 253: 364-367).

In one aspect, the invention thus involves increasing CA activity by providing increased
activity of the polypeptide of SEQ ID No. 390. Increased activity of the polypeptide of SEQ ID No.
390 can be provided by any suitable means, as further described herein. Activity may be provided,
for example, by introducing to a host cell or patient a vector containing a nucleotide sequence of
SEQ ID No. 149, treating said cell with a compound capable of increasing the expression of the
polypeptide of the invention and/or treating a cell or patient directly with a polypeptide of SEQ ID
No. 390. In preferred embodiments, the polypeptide of the invention comprises at least one amino
acid substitution, deletion or insertion. In one aspect, such amino acid changes are preferably in the
catalytic site; preferably said amino acid changes involve the substitution, deletion or insertion of a
His residue and preferably said amino acid changes increase the CO2 hydration activity of the
polypeptide of the invention.
Metal ion biosensors 

In further aspects, metal ion biosensors can be designed based on the polypeptide of SEQ
ID No. 390. Determination of metal ion concentrations in complex media such as serum, cell
cytoplasm or, for example, seawater is an important analytical function that requires high
degrees of sensitivity and selectivity.

Biosensors may be particularly useful in detecting metal ion fluxes in and between cells.
Such biosensors may exploit the metal-binding ability of the polypeptide of the invention, as described
by Thompson et al., who have developed such biosensors based on the CA enzyme (CA II). Such
biosensors are useful in the detection of metal ion flux, for example in the central nervous system.
Zinc-containing neurons found throughout the mammalian cerebral cortex, striatum and amygdalar
nuclei have been shown to release their zinc in a depolarization- and calcium-dependent fashion in
vitro and in vivo. This zinc release has been suggested to act as a trans-synaptic neuromodulator,
which has in turn been linked to excitotoxic neuronal cell death. CA based biosensors developed by
Thompson et al. showed that zinc is present and can be detected in extracellular medium from
neurons (Thompson et al., J. Neurosci. Methods 96:35-45 (2000)).

Biosensors based on CA have been shown to be extremely selective, detecting Cu at
subpicomolar levels, a sensitivity comparable to that achieved with mass spectrometric
techniques. Sensors based on the CA II isozyme have been shown to detect Zn and Cu at picomolar
levels, and Cd, Co and Ni at nanomolar levels (Thompson et al., Anal. Biochem. 267:185-195
(1999)). CA based biosensors have also demonstrated selectivity over potential interferents, such as
Mg and Ca, that are present in biological systems at mM levels in extracellular fluids (Thompson
et al. (2000), supra).

Biosensors based on the polypeptide of the invention are based on the high selectivity and
sensitivity of CA isozymes for zinc. Because the binding of Zn in the active site of the enzyme
affects the enzyme's ability to bind a CA inhibitor, it is possible to use a CA inhibitor that exhibits a
detectable change upon binding to the polypeptide of the invention to detect the fraction of
polypeptide bound to the inhibitor, and therefore bound to Zn. The fraction of polypeptide with
bound Zn in turn is determined by the concentrations of free Zn and the polypeptide of the
invention, and the dissociation constant for zinc.
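
By way of illustration only, the relationship between free Zn, the dissociation constant and the fraction of polypeptide with bound Zn may be sketched as a simple 1:1 binding isotherm. The sketch below assumes that the free Zn concentration is the known (or sought) quantity and that the polypeptide does not significantly deplete it; the function names and the numerical dissociation constant are hypothetical placeholders and are not values taken from the present application.

```python
# Minimal sketch of a 1:1 binding isotherm, assuming the polypeptide does not
# appreciably deplete free Zn. All names and numbers are illustrative only.

def fraction_bound(free_zn_molar: float, kd_molar: float) -> float:
    """Fraction of polypeptide carrying Zn at a given free Zn concentration."""
    if free_zn_molar < 0 or kd_molar <= 0:
        raise ValueError("concentrations must be non-negative and Kd positive")
    return free_zn_molar / (kd_molar + free_zn_molar)


def free_zinc_from_fraction(fraction: float, kd_molar: float) -> float:
    """Invert the isotherm: free Zn concentration for a measured bound fraction."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must satisfy 0 <= fraction < 1")
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return kd_molar * fraction / (1.0 - fraction)


if __name__ == "__main__":
    hypothetical_kd = 4e-12  # placeholder Kd in mol/L, not a measured value
    for free_zn in (1e-13, 4e-12, 1e-10):
        print(free_zn, fraction_bound(free_zn, hypothetical_kd))
```

Under these assumptions a sensor is most responsive when the free Zn concentration is near the dissociation constant, where half of the polypeptide carries Zn.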

In one example, binding of the CA inhibitor to the polypeptide of the invention is detected
by using a fluorescent inhibitor, whereby the inhibitor shows a detectable change in fluorescence
emission wavelength or polarization upon binding to the polypeptide of the invention. In one
example, a fluorescent sulfonamide is used, such as the fluorophore ABD-N (Thompson et al.
(2000), supra).
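
As a non-limiting sketch of how a polarization-type readout might be converted into a bound fraction, the following uses the standard two-state anisotropy relation, in which the observed anisotropy is an intensity-weighted average of the free and bound values. The function name, the parameter names and the example values are assumptions introduced here for illustration; the cited references should be consulted for the calibration actually used.

```python
# Two-state anisotropy relation (illustrative sketch, names are hypothetical):
#   f_bound = (r - r_free) / ((r - r_free) + R * (r_bound - r))
# where R is the ratio of fluorescence intensity of bound to free inhibitor.

def fraction_bound_from_anisotropy(
    r: float, r_free: float, r_bound: float, intensity_ratio: float = 1.0
) -> float:
    """Fraction of fluorescent inhibitor bound, from measured anisotropy r."""
    if not r_free < r_bound:
        raise ValueError("r_bound must exceed r_free")
    if not r_free <= r <= r_bound:
        raise ValueError("r must lie between r_free and r_bound")
    if intensity_ratio <= 0:
        raise ValueError("intensity_ratio must be positive")
    numerator = r - r_free
    denominator = numerator + intensity_ratio * (r_bound - r)
    return numerator / denominator


if __name__ == "__main__":
    # Placeholder calibration values, not measured data.
    print(fraction_bound_from_anisotropy(0.2, 0.1, 0.3))
    print(fraction_bound_from_anisotropy(0.2, 0.1, 0.3, intensity_ratio=2.0))
```

When the bound inhibitor is brighter than the free inhibitor (intensity ratio above 1), the same observed anisotropy corresponds to a smaller bound fraction, which is why the ratio appears in the denominator.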

Engineered CA enzymes 
CA isozymes have been shown to have differing levels of catalytic activity and efficiency.
In preferred embodiments, particularly for treatments which involve providing the increased activity
of the polypeptide of SEQ ID No 390 or for use in metal ion biosensors, the polypeptide of the
invention may be modified for increased CO2 hydration and/or zinc binding.
In particular, studies have been carried out characterizing residues important for maximal
CA activity, allowing CA isozymes to be designed having desired levels of activity. Important
structural elements in CA isozymes for zinc binding, CO2 hydration activity and stability are
reviewed in Lindskog, Pharmacol. Ther. 74(1): 1-20 (1997) and Sly (1995), supra. In one example,
studies of CA III showed that changing the Phe198 residue to a Leu198 residue (as in CA II)
resulted in a 25-fold increase in activity (Chen et al., (1993), supra). Catalysis has also been greatly
increased in CA II by replacing the Thr200 residue with His, as is normally found in CA I enzymes.
Most dramatically, a CA-related protein (CA-RP) which in its native form was missing important
residues at the catalytic site and had no detectable CO2 hydration activity at all was rendered an
active CA by only two point mutations (Sjoblom et al., FEBS Lett. 398: 322-325 (1996)).

Thus, in embodiments where the polypeptide of the invention is used to provide a source of
CO2 hydration or for its zinc binding properties, it is advantageous to modify the polypeptide of the
invention by introducing at least one amino acid substitution, deletion or insertion. In one aspect,
such amino acid changes are preferably in the catalytic site; preferably said amino acid changes
involve the substitution, deletion or insertion of a His residue and preferably said amino acid
changes increase the CO2 hydration activity of the polypeptide of the invention. Optimal amino
acid changes can be determined by the skilled artisan, particularly in view of sequence comparisons
which can be carried out with the many well-characterized CA isozymes.
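
Purely as an illustrative sketch, a point substitution written in the conventional form (for example "F198L" for Phe198 to Leu198) might be applied to a sequence string as follows. The helper name, the optional numbering offset and the short toy sequence are hypothetical; residue numbering in the CA literature follows a reference isozyme, so the position in any given polypeptide would first have to be mapped by alignment.

```python
import re

# Pattern for a single-letter substitution such as "F198L" (wild type, position, new residue).
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitution(sequence: str, mutation: str, offset: int = 0) -> str:
    """Return a copy of `sequence` carrying one point substitution.

    `offset` is the number to subtract from the 1-based literature position to
    obtain the 1-based position in `sequence` (0 when numbering coincides).
    """
    match = _SUBSTITUTION.match(mutation)
    if match is None:
        raise ValueError(f"unrecognized substitution: {mutation!r}")
    wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    index = position - 1 - offset
    if not 0 <= index < len(sequence):
        raise IndexError("position lies outside the sequence")
    if sequence[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


if __name__ == "__main__":
    toy_sequence = "MSTHF"  # hypothetical five-residue example, not a CA sequence
    print(apply_substitution(toy_sequence, "T3H"))
```

Checking the wild-type residue before substituting guards against applying a literature position to a sequence whose numbering has not been aligned.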

Protein of SEQ ID NO:252 (internal designation 105-089-3-0-G10-CS)

The protein of SEQ ID NO:252 is encoded by the cDNA of SEQ ID NO:11. Accordingly,
it will be appreciated that all characteristics and uses of the polypeptide of SEQ ID NO:252
described throughout the present application also pertain to the polypeptide encoded by the human
cDNA of clone 105-089-3-0-G10-CS. In addition, it will be appreciated that all characteristics and
uses of the nucleic acid of SEQ ID NO:11 described throughout the present application also pertain
to the human cDNA of clone 105-089-3-0-G10-CS. It is overrepresented in fetal brain.
The protein of SEQ ID NO:252 encoded by the cDNA of SEQ ID NO:11 is distributed
primarily in the prostate and salivary gland. The protein of SEQ ID NO:252 is homologous to
sequences described in PCT publication WO9827205-A2 (which describes a protein that was
isolated from a human adult salivary gland cDNA library) and PCT publication WO9839446-A2.
The disclosure of each of the preceding PCT publications is incorporated herein by reference in
its entirety.
The protein of SEQ ID NO:252 is also homologous to a polypeptide described in PCT
publication WO9835229-A1, the disclosure of which is incorporated herein by reference in its
entirety. WO9835229-A1 describes a peptide of 27 amino acid residues, 23 of which are identical to
a portion of the protein of SEQ ID NO:252 (amino acids 20-46). This corresponds to 85% identity;
3 of the 4 remaining changes are conservative, yielding 96% homology.
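
The percentages stated above follow directly from the residue counts. As a worked check only (the function name is a hypothetical introduced for illustration), identity counts identical residues over the peptide length, and homology additionally counts conservative changes:

```python
def identity_and_homology(identical: int, conservative: int, length: int):
    """Percent identity and percent homology for an ungapped peptide comparison."""
    if length <= 0 or identical < 0 or conservative < 0:
        raise ValueError("counts must be non-negative and length positive")
    if identical + conservative > length:
        raise ValueError("identical plus conservative residues exceed length")
    identity = 100.0 * identical / length
    homology = 100.0 * (identical + conservative) / length
    return identity, homology


if __name__ == "__main__":
    # Counts taken from the passage: 23 identical, 3 conservative, 27 residues.
    identity, homology = identity_and_homology(23, 3, 27)
    print(f"identity {identity:.0f}%, homology {homology:.0f}%")
```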

The protein described in WO 9835229 was identified in reflex tears that were collected
from 12 non-contact lens wearing male and female humans. Reflex tears were stimulated by gently
rubbing the nasal mucosa with a cotton wool tipped bud. Two different batches were collected
from two different groups and examined by analytical and preparative 2-dimensional
electrophoresis. After separation in the second dimension and transfer to PVDF membranes,
protein spots identified by staining with 0.1% (w/v) Coomassie Blue were loaded into a
membrane-compatible Hewlett-Packard cartridge. Sequencing was conducted with a Model G1005A
(Hewlett-Packard, CA) sequenator. One of the proteins identified migrated at 25 kDa and was
revealed to have 5 isoforms of different pI. Two of these, with pI values of 5.0 and 4.4, were
N-terminally sequenced and gave the sequence of the above peptide. The different isoforms indicate
that this protein undergoes post-translational modifications, including sialylation or acylation. The
presence of these isoforms in different degrees could reflect the disease status of the individual.
Accordingly, one embodiment of the present invention relates to the detection or diagnosis of
disease by determining the activity or level of the protein of SEQ ID NO:252 or a polynucleotide
encoding the protein of SEQ ID NO:252 in an individual. For example, detection of the secreted
protein of SEQ ID NO:252 in an individual may be accomplished non-invasively by measuring
protein levels in bodily fluids into which the protein is secreted, such as tears and saliva. Such
methods may be employed both in humans and in animals. It is probable that after the signal
peptide is cleaved, the protein of SEQ ID NO:252 is secreted into bodily fluids including tears and
probably saliva.

The protein of SEQ ID NO:252 can also be used for the screening of non-ocular diseases,
by analyzing tears for marker proteins, particularly those indicative of cancer and genetic disease. In
addition, an altered chromatographic profile (e.g. 2D gel) of the isoforms of the protein of SEQ ID
NO:252 may also indicate the disease state of an individual. For example, the levels of marker
proteins in relation to the protein of SEQ ID NO:252 may be determined to evaluate whether the
individual is suffering from a disease. Alternatively, tears may be analyzed for the levels of
different isoforms of the protein of SEQ ID NO:252 to determine whether the pattern of such
isoforms is indicative of disease.

The protein of SEQ ID NO:252 or fragments thereof may also be used as a lubricant or
cleansing agent for the eyes. This protein can be included in contact lens washing and storage
solutions. This protein can also be useful as an ingredient in eye washing solutions (e.g. eye drops)
used for everyday redness or healing after surgical/laser intervention. For example, the protein may
be used to reduce eye inflammation. Alternatively, anti-bacterial properties may be exploited by
including the protein of SEQ ID NO:252 or fragments thereof in solutions, creams or ointments for
the eyes, as well as creams or ointments in general for external applications.

Accordingly, the present invention includes the use of the protein of SEQ ID NO:252,
fragments comprising at least 5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, or 200
consecutive amino acids thereof, or fragments having a desired biological activity to treat or
ameliorate a condition in an individual. In such embodiments, the protein of SEQ ID NO:252, or a
fragment thereof, is administered to an individual in whom it is desired to increase or decrease any
of the activities of the protein of SEQ ID NO:252. The protein of SEQ ID NO:252 or fragment
thereof may be administered directly to the individual or, alternatively, a nucleic acid encoding the
protein of SEQ ID NO:252 or a fragment thereof may be administered to the individual.

Alternatively, an agent which increases the activity of the protein of SEQ ID NO:252 may be
administered to the individual. Such agents may be identified by contacting the protein of SEQ ID
NO:252 or a cell or preparation containing the protein of SEQ ID NO:252 with a test agent and
assaying whether the test agent increases the activity of the protein. For example, the test agent
may be a chemical compound or a polypeptide or peptide.

Alternatively, the activity of the protein of SEQ ID NO:252 may be decreased by
administering an agent which interferes with such activity to an individual. Agents which interfere
with the activity of the protein of SEQ ID NO:252 may be identified by contacting the protein of
SEQ ID NO:252 or a cell or preparation containing the protein of SEQ ID NO:252 with a test agent
and assaying whether the test agent decreases the activity of the protein. For example, the agent
may be a chemical compound, a polypeptide or peptide, an antibody, or a nucleic acid such as an
antisense nucleic acid or a triple helix-forming nucleic acid.

In one embodiment, the invention relates to methods and compositions using the protein of
the invention or part thereof as a marker protein to selectively identify the source of a sample as, for
example, saliva or tears, or to distinguish between two or more possible sources of a sample on the
basis of the level of the protein of SEQ ID NO:252 in the sample. For example, the protein of SEQ
ID NO:252 or fragments thereof may be used to generate antibodies using any techniques known to
those skilled in the art, including those described herein. Such antibodies may then be used to
identify tissues of unknown origin, for example, forensic samples or differentiated tumor tissue that
has metastasized to foreign bodily sites, or to differentiate different tissue types in a tissue
cross-section using immunochemistry. In such methods a sample is contacted with the antibody, which
may be detectably labeled, under conditions which facilitate antibody binding. The level of
antibody binding to the test sample is measured and compared to the level of binding to control cells
from saliva or tears or tissues other than saliva or tears to determine whether the test sample is from
saliva or tears. Alternatively, the level of the protein of SEQ ID NO:252 in a test sample may be
measured by determining the level of RNA encoding the protein of SEQ ID NO:252 in the test
sample. RNA levels may be measured using nucleic acid arrays or using techniques such as in situ
hybridization, Northern blots, dot blots or other techniques familiar to those skilled in the art. If
desired, an amplification reaction, such as a PCR reaction, may be performed on the nucleic acid
sample prior to analysis. The level of RNA in the test sample is compared to RNA levels in control
cells from saliva or tears or tissues other than saliva or tears to determine whether the test sample is
from saliva or tears.

In another embodiment, antibodies to the protein of the invention or part thereof may be
used for detection, enrichment, or purification of cells expressing the protein of SEQ ID NO:252,
including using methods known to those skilled in the art. For example, an antibody against the
protein of SEQ ID NO:252 or a fragment thereof may be fixed to a solid support, such as a
chromatography matrix. A preparation containing cells expressing the protein of SEQ ID NO:252 is
placed in contact with the antibody under conditions which facilitate binding to the antibody. The
support is washed and then the cells are released from the support by contacting the support with
agents which cause the cells to dissociate from the antibody.

In another embodiment of the present invention, the protein of SEQ ID NO:252 or a
fragment thereof may be used to diagnose disorders associated with altered expression of
the protein of SEQ ID NO:252. In such techniques, the level of the protein of SEQ ID NO:252 in
an ill individual is measured using techniques such as those described herein. The level of the
protein of SEQ ID NO:252 in the ill individual is compared to the level in normal individuals to
determine whether the individual has a level of the protein of SEQ ID NO:252 which is indicative
of disease.
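
One simple way such a comparison to normal individuals might be carried out is sketched below, purely by way of example. It assumes that protein levels from a reference group are available in a common unit and flags a measured level lying more than a chosen number of standard deviations from the reference mean. The function name, the cutoff of 2.0 and the example values are hypothetical and are not taken from the present application.

```python
from statistics import mean, stdev


def compare_to_reference(patient_level, reference_levels, z_cutoff=2.0):
    """Return (is_abnormal, z_score) for a level measured against a reference group.

    The z-score is the distance of the patient level from the reference mean,
    expressed in sample standard deviations of the reference group.
    """
    if len(reference_levels) < 2:
        raise ValueError("at least two reference measurements are required")
    spread = stdev(reference_levels)
    if spread == 0:
        raise ValueError("reference levels show no variation")
    z_score = (patient_level - mean(reference_levels)) / spread
    return abs(z_score) > z_cutoff, z_score


if __name__ == "__main__":
    reference = [10.0, 12.0, 11.0, 9.0, 13.0]  # placeholder values, arbitrary units
    print(compare_to_reference(20.0, reference))
    print(compare_to_reference(11.0, reference))
```

Because the test is two-sided, it flags both abnormally high and abnormally low levels, consistent with altered expression in either direction.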

Protein of SEQ ID NO:308 (internal designation 187-41-0-0-i21-CS)

The protein of SEQ ID NO:308 is encoded by the cDNA of SEQ ID NO:67. Accordingly,
it will be appreciated that all characteristics and uses of the polypeptide of SEQ ID NO:308
described throughout the present application also pertain to the polypeptide encoded by the human
cDNA of clone 187-41-0-0-i21-CS. In addition, it will be appreciated that all characteristics and
uses of the nucleic acid of SEQ ID NO:67 described throughout the present application also pertain
to the human cDNA of clone 187-41-0-0-i21-CS.

The protein of SEQ ID NO:308 is highly homologous to human secreted protein nf87_1
from PCT publication WO 9935252-A2 (the disclosure of which is incorporated herein by reference
in its entirety), to amino acids 26-129 of the human secreted protein SEQ ID NO:441 from PCT
publication WO 9906548-A2 (the disclosure of which is incorporated herein by reference in its
entirety), and to amino acids 26-114 of human secreted protein SEQ ID NO:439 from PCT
publication WO 9906548-A2, the disclosure of which is incorporated herein by reference in its
entirety. Thus, the protein of the invention appears to be a polymorphic variant of nf87_1. Since
most of the proteins with high homology to the sequence of the invention have longer 5' termini, it
is conceivable that the protein of the invention is a truncated/spliced variant of these proteins.
The protein of SEQ ID NO:308 was identified among the cDNAs from a library constructed
from brain. Tissue distribution analysis through a BLAST analysis of databases shows that mRNA 
encoding this protein was found primarily in kidney, liver, and cancerous prostate. 

The protein of SEQ ID NO:308 has chemical and structural homology to human
interferon-inducible (IFI) protein isoforms p27 (63% identity), HIFI (50% identity), and to
interferon-induced protein 6-16 precursor (IFI-6-16, 36% identity). Furthermore, the protein of the
invention has structural homology (40% identity) to the human erythropoietin (EPO) primary
response gene, EPRG3pt, from PCT publication WO 9906063-A2, the disclosure of which is
incorporated herein by reference in its entirety. Thus, the present invention relates to nucleic acid
and amino acid sequences of a novel IFI protein and to the use of these sequences in the diagnosis,
study, prevention and treatment of disease.

The protein of SEQ ID NO:308 comprises 105 amino acids. From the amino acid
alignments and the hydrophobicity plots, it has a predicted signal peptide sequence spanning
residues 31-43 and two predicted transmembrane domains spanning residues 17-37 and 48-68.
Accordingly, one embodiment of the present invention is a polypeptide comprising the signal
peptide and/or one or more of the transmembrane domains.
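
The present application does not state which hydrophobicity scale or window was used for the plots referred to above. As a minimal sketch under the assumption of a Kyte-Doolittle sliding-window analysis, with the commonly cited window of 19 residues and cutoff of 1.6 for candidate membrane-spanning segments, a hydropathy profile might be computed as follows. The function names and the toy input are hypothetical; the sequence of SEQ ID NO:308 itself is not reproduced here.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 19) -> list:
    """Mean hydropathy of every window of `window` consecutive residues."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(scores) - window + 1)
    ]


def hydrophobic_windows(sequence: str, window: int = 19, threshold: float = 1.6) -> list:
    """1-based (start, end) residue ranges whose mean hydropathy meets the threshold."""
    profile = hydropathy_profile(sequence, window)
    return [
        (start + 1, start + window)
        for start, score in enumerate(profile)
        if score >= threshold
    ]


if __name__ == "__main__":
    toy_sequence = "D" * 10 + "L" * 19 + "K" * 10  # hypothetical test input
    print(hydrophobic_windows(toy_sequence))
```

Overlapping windows that pass the cutoff would normally be merged into a single predicted segment; that step is omitted here for brevity.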

Interferons (IFNs) are a part of the group of intercellular messenger proteins known as
cytokines. α-IFN is the product of a multigene family of at least 16 members, whereas β-IFN is the
product of a single gene. α- and β-IFNs are also known as type I IFNs. Type I IFNs are produced in
a variety of cell types. Biosynthesis of type I IFNs is stimulated by viruses and other pathogens,
and by various cytokines and growth factors. γ-IFN, also known as type II IFN, is produced in
T-cells and natural killer cells. Antigens to which the organism has been sensitized stimulate
biosynthesis of type II IFN. Both α- and γ-IFNs are immunomodulators and anti-inflammatory
agents, activating macrophages, T-cells and natural killer cells.
IFNs are part of the body's natural defense against viruses and tumors. They exert these defenses
by affecting the function of the immune system and by direct action on pathogens and tumor cells.
IFNs mediate these multiple effects in part by inducing the synthesis of many cellular proteins.
Some interferon-inducible (IFI) genes are induced equally well by α-, β- and γ-IFNs. Other IFI
genes are preferentially induced by the type I or by the type II IFNs. The various proteins produced
by IFI genes possess antitumor, antiviral and immunomodulatory functions. The expression of
tumor antigens in cancer cells is increased by α-IFN, which renders the cancer cells more susceptible
to immune rejection. The IFI proteins synthesized in response to viral infections are known to
inhibit viral functions such as cell penetration, uncoating, RNA and protein synthesis, assembly and
release (Hardman JG et al (1996) The Pharmacological Basis of Therapeutics, McGraw-Hill,
New York NY pp 1211-1214, the disclosure of which is incorporated herein by reference in its
entirety). Type II IFN stimulates expression of major histocompatibility complex (MHC) proteins
and is thus used in immune response enhancement.

WO 01/42451 PCT/IB00/01938

The IFI gene known as 6-16 encodes an mRNA which is highly induced by type I IFNs in a
variety of human cells (Kelly JM et al (1986) EMBO J 5:1601-1606, the disclosure of which is
incorporated herein by reference in its entirety). After induction, 6-16 mRNA constitutes as much
as 0.1% of the total cellular mRNA. The 6-16 mRNA is present at only very low levels in the
absence of type I IFN, and is only weakly induced by type II IFN. The 6-16 mRNA encodes a
hydrophobic protein of 130 amino acids. The first 20 to 23 amino acids comprise a putative signal
peptide. Protein 6-16 has at least two predicted transmembrane regions culminating in a negatively
charged C-terminus.

The p27 gene encodes a protein with 41% amino acid sequence identity to the 6-16 protein.
The p27 gene is expressed in some breast tumor cell lines and in a gastric cancer cell line. In other
breast tumor cell lines, in the HeLa cervical cancer cell line, and in fetal lung fibroblasts, p27
expression occurs only upon α-IFN induction. In one breast tumor cell line, p27 is independently
induced by estradiol and by IFN (Rasmussen UB et al (1993) Cancer Res 53:4096-4101, the
disclosure of which is incorporated herein by reference in its entirety). Expression of p27 was
analyzed in 21 primary invasive breast carcinomas, 1 breast cancer bone metastasis, and 3 breast
fibroadenomas. High levels of p27 were found in about one-half of the primary carcinomas and in
the bone metastasis, but not in the fibroadenomas. These observations suggest that certain breast
tumors may produce high levels of, or have increased sensitivity to, type I IFN as compared to other
breast tumors (Rasmussen UB et al, supra). In addition, the p27 gene is expressed at significant levels
in normal tissues including colon, stomach and lung, but is not expressed in placenta, kidney, liver or
skin (Rasmussen UB et al, supra).

The small hydrophobic IFI gene products may contribute to viral resistance. A hepatitis-C
virus (HCV)-induced gene, 130-51, was isolated from a cDNA library prepared from chimpanzee
liver during the acute phase of the infection. The protein product of this gene has 97% identity to
the human 6-16 protein (Kato T et al (1992) Virology 190:856-860, the disclosure of which is
incorporated herein by reference in its entirety). The authors of the preceding paper suggest that
HCV infection actively induces IFN expression, which in turn induces expression of IFI genes
including 130-51. The IFI proteins synthesized in response to viral infections are known to inhibit
viral functions such as penetration, uncoating, RNA or protein synthesis, assembly or release. The
130-51 protein may inhibit one or more of these functions in HCV. A particular virus may be
inhibited in multiple functions by IFI proteins. In addition, the principal inhibitory effect exerted by
IFI proteins differs among the virus families (Hardman JG, supra, p 1211, the disclosure of which is
incorporated herein by reference).

The HIFI protein (PCT publication WO 9812223-A2, the disclosure of which is
incorporated herein by reference in its entirety) is a human sequence identified among cDNAs from
a library constructed from human neonatal kidney. Northern blot analysis using the LIFESEQ™
database (Incyte Pharmaceuticals, Palo Alto, CA) shows that HIFI mRNA was found only in
neonatal kidney. The HIFI protein consists of 104 amino acids and has 55%, 45%, and 46% amino
acid sequence identity to p27, 6-16 and 130-51, respectively.

Based on the chemical and structural homology between the protein of SEQ ID NO:308 and
the small hydrophobic IFI proteins from human and chimpanzee, it is believed that the protein of
SEQ ID NO:308 is synthesized when interferons are produced in infections, inflammation,
autoimmune diseases etc. Interferons are produced in response to various cytokines and growth
factors, in viral infections, inflammation, autoimmune diseases, and cancers. Accordingly, the
protein of SEQ ID NO:308 or fragments thereof may be used in diagnosis and treatment of diseases
such as, but not limited to, autoimmune disorders such as rheumatoid arthritis, Graves' disease,
systemic lupus erythematosus, autoimmune hepatitis, Wegener's granulomatosis, sarcoidosis,
polyarthritis, pemphigus, pemphigoid, erythema multiforme, Sjogren's syndrome, inflammatory
bowel disease, multiple sclerosis, myasthenia gravis, keratitis, scleritis, Type I diabetes,
insulin-dependent diabetes mellitus, lupus nephritis, and allergic encephalomyelitis; proliferative disorders
including various forms of cancer such as leukemias, lymphomas (Hodgkin's and non-Hodgkin's),
sarcomas, melanomas, adenomas, carcinomas of solid tissue, hypoxic tumors, squamous cell
carcinomas of the mouth, throat, larynx, and lung, genitourinary cancers such as cervical and
bladder cancer, hematopoietic cancers, head and neck cancers, and nervous system cancers, benign
lesions such as papillomas, atherosclerosis, angiogenesis; viral infections, in particular HCV and
HIV infections, as well as other pathogen-induced infections (e.g. leishmania).
The protein of SEQ ID NO:308 or fragments thereof may also be used to treat conditions
associated with inflammation or immune impairment (e.g. rheumatoid arthritis, osteoarthritis and
AIDS).

Another embodiment of the present invention relates to the use of the protein of SEQ ID
NO:308 or fragments thereof to treat and/or prevent the ill effects of bacterial infection during
pregnancy in mammals, such as spontaneous abortion and maternal death. In a preferred
embodiment, the protein of the invention may be used to counteract the effects of the bacterial
endotoxin lipopolysaccharide (LPS). Methods for using such compositions are described in
Dziegielewska and Andersen, Biol. Neonate, 74:372-5 (1998), the disclosure of which is
incorporated herein by reference in its entirety.

Furthermore, the protein of SEQ ID NO:308 or fragments thereof is useful as a reagent for
analyzing the control of gene expression by interferons and other cytokines in both normal and
diseased cells. The protein of SEQ ID NO:308 or fragments thereof may be used to identify
specific molecules to which it binds, such as agonists, antagonists or inhibitors.

Another embodiment of the present invention relates to methods of using the protein of
SEQ ID NO:308 or fragments thereof to identify and/or quantify cytokines of the interferon family
as well as other cytokines such as IL10 and tumor antigens, which may interact with the protein of
the invention.
The protein of SEQ ID NO:308 or fragments thereof may also be included in
pharmaceutical preparations for treating cancer or prevention/treatment of other diseases associated
with changes in expression of the protein of the invention. In another embodiment of the present
invention, the protein of SEQ ID NO:308 or fragments thereof is used to inhibit and/or modulate the
effect of cytokines and related molecules such as IL-2, TNF alpha, CTLA4, CD28, and others, by
preventing the binding of the endogenous cytokines to their natural receptors, thereby blocking cell
proliferation or inhibitory signals generated by the ligand-receptor binding event.

The protein of SEQ ID NO:308 or fragments thereof is useful to correct defects in in vivo
models of disease such as autoimmune, inflammation and tumor models, by injecting the protein
either intraperitoneally, intravenously, subcutaneously or directly into the diseased tissue.

The DNA encoding the protein of SEQ ID NO:308 or fragments thereof is useful in 
diagnostic assays for conditions/diseases associated with expression of the protein of the invention. 
The diagnost.c assay is useful to distinguish between absence, presence, and excess expression of 
the protein of the invention and to monitor regulation of levels of the protein of the invention during 
15 therapeutic intervention. The DNA may also be incorporated into effective eukaryotic expression 
vectors and directly targeted to a specific tissue, organ, or cell population for use in gene therapy to 
treat the above mentioned conditions, including tumors and/or to correct disease- or genetic-induced 
defects in any of the above mentioned proteins including the protein of the invention. The DNA 
may also be used to design antisense sequences and ribozymes, which can be administered to 
modify gene expression in tumor and pathogen-infected cells and to influence expression of
cytokines and growth factors. In vivo delivery of genetic constructs into subjects can be developed
to the point of targeting specific cell types, such as tumor cells, in which expression of the protein of the
invention may be affected or may modulate the expression and/or activity of other proteins such as
cytokines, growth factors, their receptors and/or tumor antigens. The DNA is also useful to detect unknown
upstream sequences (e. g. promoters and regulatory elements) by standard techniques and for
research into the control of gene expression by interferons and other cytokines, as well as growth 
and transcription factors in normal and diseased cells. Hybridization probes are useful to detect 
DNA encoding the protein of the invention (or closely related molecules) in biological samples, and 
for mapping the naturally occurring genomic sequence to a particular chromosome/chromosome 
region. The DNA may be used to generate and/or treat in vivo animal models of disease, including
susceptibility or resistance to infection, inflammation, tumors and autoimmune conditions, as well 
as tumor therapy, based on vaccine, knock-out and transgene technologies. 

Antibodies against the protein of SEQ ID NO:308 or fragments thereof are useful for the 
diagnosis of conditions and diseases associated with its expression and to quantify the protein of the
invention (e. g. in assays to monitor patients during therapeutic intervention). Antibodies specific
for the protein may include, but are not limited to, polyclonal, monoclonal, chimeric, single chain,
Fab fragments produced by a Fab expression library. Neutralizing antibodies are especially
preferred for diagnostics and therapeutics. Diagnostic assays for the protein of the invention include
methods utilizing the antibody and a label to detect the protein of the invention in human body 
fluids or extracts of cells or tissues. 

The protein of the invention and its catalytic or immunogenic fragments or oligopeptides 
thereof, can be used for screening therapeutic compounds in any variety of drug screening
techniques including high throughput. Methods which may be used to quantitate the expression of
the nucleotide or protein of the invention include, but are not limited to, polymerase chain reaction
(PCR), RT-PCR, RNAse protection, Northern and western blotting, enzyme-linked immunosorbent
assay (ELISA), radioimmunoassay (RIA), fluorescent activated cell sorting (FACS),
immunoprecipitation, and chromatography.

Under conditions of significant blood loss, EPO therapy, or both, iron-restricted 
erythropoiesis is evident. However, intravenous or oral iron therapy has substantial drawbacks. 
Moreover, traditional biochemical markers of storage iron in patients with anemia of chronic 
disease are unhelpful in the assessment of iron status (Lawrence T et al (2000) Blood 96:823-833, 
the disclosure of which is incorporated herein by reference in its entirety). As the protein of SEQ
ID NO:308 bears homology to the human erythropoietin (EPO) primary response gene, EPRG3pt, it 
may be used to promote red blood cell formation or to monitor the value of safer intravenous iron 
preparations in patients with blood loss anemia, particularly those undergoing EPO therapy. 

The hydrophobic IFI protein of SEQ ID NO: 308 or fragments thereof may be used to 
diagnose conditions associated with its induction. For example, the protein of SEQ ID NO:308 or
fragments thereof may be useful in the diagnosis and treatment of tumors, viral infections, 
inflammation, or conditions associated with impaired immunity, anemia of chronic blood loss or 
chronic disease, hemochromatosis, and EPO therapy. Furthermore, this protein may be used for 
investigating the control of gene expression by IFNs and other cytokines, as well as hormones and
growth factors, in normal and diseased cells.

The protein of SEQ ID NO:308 or fragments thereof is useful to correct defects in in vivo 
models of disease such as autoimmune, inflammation, anemia, iron-overload and tumor models, by 
injecting the protein either intraperitoneally, intravenously, subcutaneously or directly into the
diseased tissue.

In addition, the protein of SEQ ID NO:308 is structurally related to other proteins having
homology and/or structural similarity with human p27 (Rasmussen, U.B., et al., 1993, Cancer
Research 53:4096-4101, the disclosure of which is incorporated herein by reference). Accordingly, 
the protein of brain, fetal brain, kidney, fetal kidney, or colon may be used to regulate the 
proliferation of EPO-dependent cells or the growth and development of erythroid and other
hematopoietic lineages.

The protein of SEQ ID NO:308 or fragments thereof, or polynucleotides encoding the 
protein of SEQ ID NO:308 or fragments thereof, may be used to treat or ameliorate anemia of 
chronic disease and chronic renal failure, polycythemia, cancer, AIDS, drug- and
phlebotomy-induced anemias, hemochromatosis, erythropoiesis mediated by EPO therapy, and other conditions
associated with altered activity or levels of the protein of SEQ ID NO:308. 

In another embodiment, the present invention relates to methods for identifying agonists 
and antagonists/inhibitors using the protein of SEQ ID NO:308 or fragments thereof, and treating
conditions with the identified compounds. In a still further aspect, the invention relates to 
diagnostic assays for detecting diseases associated with inappropriate levels or activity of the 
protein of SEQ ID NO:308. Still another embodiment of the invention relates to the use of the
protein of SEQ ID NO:308, fragments thereof, or the DNA encoding the protein of SEQ ID NO:308 or
fragments thereof to monitor the value of iron therapy in patients undergoing EPO therapy, or
experiencing blood loss, or both. 

The DNA encoding the protein of SEQ ID NO: 308 or fragments thereof is useful in 
diagnostic assays for conditions/diseases associated with abnormal expression of the protein of SEQ 
ID NO:308. The diagnostic assay is useful to distinguish between absence, presence, and excess
expression of the protein of the invention and to monitor regulation of levels of the protein of the
invention during therapeutic intervention. The DNA may also be incorporated into effective
eukaryotic expression vectors and directly targeted to a specific tissue, organ, or cell population for 
use in gene therapy to treat the above mentioned conditions, including tumors and/or to correct 
disease- or genetic-induced defects in any of the above mentioned proteins including the protein of
the invention. The DNA may also be used to design antisense sequences and ribozymes, which can
be administered to modify gene expression in tumor and pathogen-infected cells and to influence 
expression of cytokines, hormones and growth factors. In vivo delivery of genetic constructs into 
subjects can be developed to the point of targeting specific cell types, such as tumor cells, in which
expression of the protein of the invention may be affected or may modulate the expression and/or
activity of other proteins such as cytokines, growth factors, their receptors and/or tumor antigens. The DNA
is also useful to detect unknown upstream sequences (e. g. promoters and regulatory elements) by 
standard techniques and for research into the control of gene expression by interferons and other 
cytokines, as well as growth and transcription factors in normal and diseased cells. Hybridization 
probes are useful to detect DNA encoding the protein of the invention (or closely related molecules)
in biological samples, and for mapping the naturally occurring genomic sequence to a particular
chromosome/chromosome region. The DNA may be used to generate and/or treat in vivo animal 
models of disease, including susceptibility or resistance to infection, tumors, autoimmune 
conditions, anemia and iron-overload, as well as tumor therapy, based on vaccine, knock-out and 
transgene technologies. 

Antibodies against the protein of SEQ ID NO:308 are useful for the diagnosis of conditions
and diseases associated with its expression and to quantify the protein of the invention (e. g. in
assays to monitor patients during therapeutic intervention). Antibodies specific for the protein may
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab fragments
produced by a Fab expression library. Neutralizing antibodies are especially preferred for 
diagnostics and therapeutics. Diagnostic assays for the protein of SEQ ID NO:308 include methods 
utilizing the antibody and a label to detect the protein of the invention in human body fluids or 
extracts of cells or tissues.

The protein of SEQ ID NO:308 and its catalytic or immunogenic fragments or oligopeptides 
thereof, can be used for screening therapeutic compounds in any variety of drug screening 
techniques including high throughput. Methods which may be used to quantitate the expression of 
the nucleotide or protein of the invention include, but are not limited to, polymerase chain reaction 
(PCR), RT-PCR, RNAse protection, Northern blotting, enzyme-linked immunosorbent assay
(ELISA), radioimmunoassay (RIA), fluorescent activated cell sorting (FACS), 
immunoprecipitation, and chromatography. 

Accordingly, the present invention includes the use of the protein of SEQ ID NO: 308, 
fragments comprising at least 5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, or 200 
consecutive amino acids thereof, or fragments having a desired biological activity to treat or 
ameliorate a condition in an individual. For example, the condition may be cancer, including breast 
cancer, viral infection, bacterial infection, inflammation, autoimmune disorders, rheumatoid 
arthritis, Graves disease, systemic lupus erythematosus, autoimmune hepatitis, Wegener's 
granulomatosis, sarcoidosis, polyarthritis, pemphigus, pemphigoid, erythema multiforme, Sjogren's
syndrome, inflammatory bowel disease, multiple sclerosis, myasthenia gravis, keratitis, scleritis,
Type I diabetes, insulin-dependent diabetes mellitus, Lupus Nephritis, and allergic 
encephalomyelitis; proliferative disorders including various forms of cancer such as leukemias, 
lymphomas (Hodgkins and non-Hodgkins), sarcomas, melanomas, adenomas, carcinomas of solid
tissue, hypoxic tumors, squamous cell carcinomas of the mouth, throat, larynx, and lung, 
genitourinary cancers such as cervical and bladder cancer, hematopoietic cancers, head and neck 
cancers, and nervous system cancers, benign lesions such as papillomas, atherosclerosis, 
angiogenesis; viral infections, in particular HCV and HIV infections, as well as other
pathogen-induced infections (e. g. leishmania).
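
By way of illustration only, the fragments of consecutive amino acids recited above can be enumerated computationally. The following Python sketch lists every contiguous fragment of a given length from a sequence. The function names and the toy sequence are hypothetical and are not part of the disclosure; the toy sequence is a placeholder and is not SEQ ID NO:308.

```python
# Fragment lengths recited in the text above.
FRAGMENT_LENGTHS = (5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, 200)


def consecutive_fragments(sequence, length):
    """Return every fragment of `length` consecutive residues of `sequence`."""
    if length <= 0:
        raise ValueError("length must be positive")
    # A sequence shorter than `length` yields an empty list.
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


def fragment_counts(sequence):
    """Map each recited fragment length to the number of fragments available."""
    return {n: len(consecutive_fragments(sequence, n)) for n in FRAGMENT_LENGTHS}


if __name__ == "__main__":
    toy = "MKTLLVAGLLAVSAAQG"  # placeholder sequence (assumption), 17 residues
    for fragment in consecutive_fragments(toy, 15):
        print(fragment)
```

For a protein of 130 amino acids, this enumeration would give 31 fragments of 100 consecutive residues and none of 150 or 200 residues.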

In such embodiments, the protein of SEQ ID NO:308, or a fragment thereof, is
administered to an individual in whom it is desired to increase or decrease any of the activities of 
the protein of SEQ ID NO:308. The protein of SEQ ID NO:308 or fragment thereof may be
administered directly to the individual or, alternatively, a nucleic acid encoding the protein of SEQ
ID NO:308 or a fragment thereof may be administered to the individual. Alternatively, an agent
which increases the activity of the protein of SEQ ID NO: 308 may be administered to the 
individual. Such agents may be identified by contacting the protein of SEQ ID NO:308 or a cell or 
preparation containing the protein of SEQ ID NO:308 with a test agent and assaying whether the 
test agent increases the activity of the protein. For example, the test agent may be a chemical
compound or a polypeptide or peptide. 

Alternatively, the activity of the protein of SEQ ID NO:308 may be decreased by 
administering an agent which interferes with such activity to an individual. Agents which interfere 
with the activity of the protein of SEQ ID NO:308 may be identified by contacting the protein of
SEQ ID NO:308 or a cell or preparation containing the protein of SEQ ID NO:308 with a test 
agent and assaying whether the test agent decreases the activity of the protein. For example, the 
agent may be a chemical compound, a polypeptide or peptide, an antibody, or a nucleic acid such as 
an antisense nucleic acid or a triple helix-forming nucleic acid. 

In one embodiment, the invention relates to methods and compositions using the protein of
the invention or part thereof as a marker protein to selectively identify the source of a sample as, for
example, brain, kidney, liver, or cancerous prostate, or to distinguish between two or more possible
sources of a sample on the basis of the level of the protein of SEQ ID NO:308 in the sample. For
example, the protein of SEQ ID NO:308 or fragments thereof may be used to generate antibodies
using any techniques known to those skilled in the art, including those described herein. Such
antibodies may then be used to identify tissues of unknown origin, for example, forensic samples,
differentiated tumor tissue that has metastasized to foreign bodily sites, or to differentiate different
tissue types in a tissue cross-section using immunochemistry. In such methods a sample is
contacted with the antibody, which may be detectably labeled, under conditions which facilitate
antibody binding. The level of antibody binding to the test sample is measured and compared to the
level of binding to control cells from brain, kidney, liver, or cancerous prostate or tissues other than
brain, kidney, liver, or cancerous prostate to determine whether the test sample is from brain,
kidney, liver, or cancerous prostate. Alternatively, the level of the protein of SEQ ID NO:308 in a
test sample may be measured by determining the level of RNA encoding the protein of SEQ ID
NO:308 in the test sample. RNA levels may be measured using nucleic acid arrays or using
techniques such as in situ hybridization, Northern blots, dot blots or other techniques familiar to
those skilled in the art. If desired, an amplification reaction, such as a PCR reaction, may be
performed on the nucleic acid sample prior to analysis. The level of RNA in the test sample is
compared to RNA levels in control cells from brain, kidney, liver, or cancerous prostate or tissues
other than brain, kidney, liver, or cancerous prostate to determine whether the test sample is from
brain, kidney, liver, or cancerous prostate.
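
The comparison of a test sample with control tissues described above may be sketched in Python as follows. This is a minimal illustration under stated assumptions and not a validated assay procedure: the function name `closest_source`, the use of the nearest control mean as the decision rule, and all numerical values are hypothetical.

```python
from statistics import mean


def closest_source(test_level, control_levels):
    """Return the control tissue whose mean level is nearest to `test_level`.

    `control_levels` maps a tissue name to a list of measured levels
    (antibody binding or RNA level, in the same arbitrary units as the test).
    The nearest-mean rule is an assumption made for this sketch.
    """
    means = {tissue: mean(values) for tissue, values in control_levels.items() if values}
    if not means:
        raise ValueError("no control measurements supplied")
    return min(means, key=lambda tissue: abs(means[tissue] - test_level))


if __name__ == "__main__":
    # Hypothetical control measurements in arbitrary units.
    controls = {
        "brain": [9.8, 10.4, 10.1],
        "kidney": [4.9, 5.3, 5.0],
        "other tissues": [0.4, 0.6, 0.5],
    }
    print(closest_source(9.5, controls))
```

In practice the decision rule would also need to account for assay variability, for example by requiring a minimum separation between control groups before a source is assigned.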

In another embodiment, antibodies to the protein of the invention or part thereof may be 
used for detection, enrichment, or purification of cells expressing the protein of SEQ ID NO:308,
including using methods known to those skilled in the art. For example, an antibody against the
protein of SEQ ID NO:308 or a fragment thereof may be fixed to a solid support, such as a
chromatography matrix. A preparation containing cells expressing the protein of SEQ ID NO:308 is
placed in contact with the antibody under conditions which facilitate binding to the antibody. The
support is washed and then the cells are released from the support by contacting the support with
agents which cause the cells to dissociate from the antibody. 

In another embodiment of the present invention, the protein of SEQ ID NO:308 or a 
fragment thereof may be used to diagnose disorders associated with altered expression of
the protein of SEQ ID NO:308. In such techniques, the level of the protein of SEQ ID NO:308 in
an ill individual is measured using techniques such as those described herein. The level of the
protein of SEQ ID NO:308 in the ill individual is compared to the level in normal individuals to
determine whether the individual has a level of the protein of SEQ ID NO: 308 which is associated 
with disease. 

Proteins of SEQ ID NOs:289 and 307 (internal designations 175-1-3-0-E5-CS.cor and 187-39-0-0-k12-CS)

The protein of SEQ ID NO:289 is encoded by the cDNA of SEQ ID NO:48. Accordingly,
it will be appreciated that all characteristics and uses of the polypeptide of SEQ ID NO:289
described throughout the present application also pertain to the polypeptide encoded by the human
cDNA of clone 175-1-3-0-E5-CS. In addition, it will be appreciated that all characteristics and uses
of the nucleic acid of SEQ ID NO:48 described throughout the present application also pertain to
the human cDNA of clone 175-1-3-0-E5-CS.

The protein of the invention consists of 130 amino acids. From the amino acid alignments
and the hydrophobicity plots, it has a predicted signal peptide sequence spanning residues 8-20 and
four predicted transmembrane domains spanning residues 2-24, 42-61, 70-90 and 99-119.
Accordingly, some embodiments of the present invention relate to polypeptides comprising the
signal peptide and/or one or more of the transmembrane domains.
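
As an illustration of how a hydrophobicity plot of the kind referred to above may be computed, the following Python sketch applies a sliding-window average of Kyte-Doolittle hydropathy values. The present text does not state which hydropathy scale or parameters were used, so the scale, the 19-residue window and the 1.6 threshold (a commonly cited heuristic for membrane-spanning segments) are assumptions of this sketch, as are the function names and the toy sequence.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not named in the text).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of each window of `window` consecutive residues."""
    scores = [KD[residue] for residue in seq.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def hydrophobic_segments(seq, window=19, threshold=1.6):
    """Merge windows whose mean exceeds `threshold` into 1-based residue ranges."""
    segments = []
    for i, score in enumerate(hydropathy_profile(seq, window)):
        if score <= threshold:
            continue
        start, end = i + 1, i + window
        if segments and start <= segments[-1][1] + 1:
            # Overlaps or abuts the previous segment: extend it.
            segments[-1] = (segments[-1][0], max(segments[-1][1], end))
        else:
            segments.append((start, end))
    return segments


if __name__ == "__main__":
    toy = "D" * 10 + "L" * 19 + "D" * 10  # placeholder, not SEQ ID NO:289
    print(hydrophobic_segments(toy))
```

Because every window that scores above the threshold is merged in full, the reported ranges extend somewhat beyond the hydrophobic core; dedicated prediction tools refine such boundaries.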

The protein of SEQ ID NO:289 encoded by the cDNA of SEQ ID NO:48 is homologous to
SEQ ID NO:4199 from EP 1033401-A2 (the disclosure of which is incorporated herein by
reference in its entirety), a human secreted protein. Another protein, SEQ ID NO:307, encoded by
the cDNA of SEQ ID NO:66, is a polymorphic variant of the protein of SEQ ID NO:289, and shares
all of the herein-described functions and uses.

The present invention relates to a novel protein identified among the cDNAs from a library
constructed from salivary gland, and to the use of the nucleic acid and amino acid sequences
disclosed herein in the study, diagnosis, prevention, and treatment of disease. Tissue distribution
analysis predicted by BLAST on databases shows that mRNA encoding this protein was found
primarily in brain and fetal brain, with lower amounts in kidney, fetal kidney and colon.

Interferons (IFNs) are a part of the group of intercellular messenger proteins known as
cytokines. α-IFN is the product of a multigene family of at least 16 members, whereas β-IFN is the
product of a single gene. α- and β-IFNs are also known as type I IFNs. Type I IFNs are produced in
a variety of cell types. Biosynthesis of type I IFNs is stimulated by viruses and other pathogens,
and by various cytokines and growth factors. γ-IFN, also known as type II IFN, is produced in
T-cells and natural killer cells. Antigens to which the organism has been sensitized stimulate
biosynthesis of type II IFN. Both α- and γ-IFNs are immunomodulators and anti-inflammatory
agents, activating macrophages, T-cells and natural killer cells.

IFNs are part of the body's natural defense to viruses and tumors. They exert these defenses
by affecting the function of the immune system and by direct action on pathogens and tumor cells.
IFNs mediate these multiple effects in part by inducing the synthesis of many cellular proteins. 
Some interferon-inducible (IFI) genes are induced equally well by α-, β- and γ-IFNs. Other IFI
genes are preferentially induced by the type I or by the type II IFNs. The various proteins produced
by IFI genes possess antitumor, antiviral and immunomodulatory functions. The expression of
tumor antigens in cancer cells is increased by α-IFN, and renders the cancer cells more susceptible
to immune rejection. The IFI proteins synthesized in response to viral infections are known to
inhibit viral functions such as cell penetration, uncoating, RNA and protein synthesis, assembly and
release (Hardman JG et al (1996) The Pharmacological Basis of Therapeutics, McGraw-Hill,
New York NY pp 1211-1214, the disclosure of which is incorporated herein by reference in its
entirety). Type II IFN stimulates expression of major histocompatibility complex (MHC) proteins
and is thus used in immune response enhancement. 

The protein of SEQ ID NO:289 is a small hydrophobic protein having chemical and 
structural homology to human interferon-inducible (IFI) protein isoforms 6-16 (97% identity), HIFI
(44%), and p27 (33%), as well as 130-51, the chimpanzee homolog of 6-16 (97%). Thus, the
protein of SEQ ID NO:289 and the nucleic acid encoding it are polymorphic variants of 6-16 or the 
gene encoding 6-16. The protein of SEQ ID NO:289, fragments thereof, or nucleic acids encoding 
the protein of SEQ ID NO:289 or fragments thereof may be used in the diagnosis, study, prevention 
and treatment of disease as described below. 

The IFI gene known as 6-16 encodes an mRNA which is highly induced by type I IFNs in a
variety of human cells (Kelly JM et al (1986) EMBO J 5:1601-1606, the disclosure of which is
incorporated herein by reference in its entirety). After induction, 6-16 mRNA constitutes as much as 
0.1% of the total cellular mRNA. The 6-16 mRNA is present at only very low levels in the absence 
of type I IFN, and is only weakly induced by type II IFN. The 6-16 mRNA encodes a hydrophobic
protein of 130 amino acids. The first 20 to 23 amino acids comprise a putative signal peptide.
Protein 6-16 has at least two predicted transmembrane regions culminating in a negatively charged
C-terminus. 

The p27 gene encodes a protein with 41% amino acid sequence identity to the 6-16 protein. 
The p27 gene is expressed in some breast tumor cell lines and in a gastric cancer cell line. In other 
breast tumor cell lines, in the HeLa cervical cancer cell line, and in fetal lung fibroblasts, p27
expression occurs only upon α-IFN induction. In one breast tumor cell line, p27 is independently
induced by estradiol and by IFN (Rasmussen UB et al (1993) Cancer Res 53:4096-4101, the
disclosure of which is incorporated herein by reference in its entirety). Expression of p27 was
analyzed in 21 primary invasive breast carcinomas, 1 breast cancer bone metastasis, and 3 breast 
fibroadenomas. High levels of p27 were found in about one-half of the primary carcinomas and in 
the bone metastasis, but not in the fibroadenomas. These observations suggest that certain breast 
tumors may produce high levels of, or have increased sensitivity to, type I IFN as compared to other
breast tumors (Rasmussen UB et al, supra). In addition, the p27 gene is expressed at significant levels
in normal tissues including colon, stomach and lung, but is not expressed in placenta, kidney, liver or
skin (Rasmussen UB et al, supra).

The small hydrophobic IFI gene products may contribute to viral resistance. A hepatitis-C
virus (HCV)-induced gene, 130-51, was isolated from a cDNA library prepared from chimpanzee
liver during the acute phase of the infection. The protein product of this gene has 97% identity to 
the human 6-16 protein (Kato T et al (1992) Virology 190:856-860, the disclosure of which is 
incorporated herein by reference in its entirety). The authors of this paper suggest that HCV 
infection actively induces IFN expression, which in turn induces expression of IFI genes including
130-51. The IFI proteins synthesized in response to viral infections are known to inhibit viral
functions such as penetration, uncoating, RNA or protein synthesis, assembly or release. The
130-51 protein may inhibit one or more of these functions in HCV. A particular virus may be inhibited
in multiple functions by IFI proteins. In addition, the principal inhibitory effect exerted by IFI
proteins differs among the virus families (Hardman JG, supra, p 1211, the disclosure of which is
incorporated herein by reference).

The HIFI protein (PCT publication WO 9812223-A2, the disclosure of which is 
incorporated herein by reference in its entirety) is a human sequence identified among cDNAs from 
a library constructed from human neonatal kidney. Northern blot analysis using LIFESEQ™ 
database (Incyte Pharmaceuticals, Palo Alto, CA) shows that HIFI mRNA was found only in
neonatal kidney. The HIFI protein consists of 104 amino acids and has 55%, 45%, and 46% amino
acid sequence identity to p27, 6-16 and 130-51, respectively.
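
Percentages of amino acid sequence identity such as those quoted above are derived from pairwise alignments. The following Python sketch shows one way such a figure may be computed from two sequences that have already been aligned. It is illustrative only: the text does not state which alignment program or denominator convention was used, so the convention adopted here (matches divided by the number of alignment columns containing at least one residue) is an assumption, and the function name and example strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Columns in which both sequences carry a gap are ignored; every other
    column counts toward the denominator (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    # Toy alignment (placeholder strings, not the sequences discussed above).
    print(round(percent_identity("ACDE-G", "ACDQFG"), 1))
```

Other conventions, such as dividing by the length of the shorter sequence, give different values for the same alignment, which is one reason identity figures from different sources may not agree exactly.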

The hydrophobic IFI proteins of the invention may provide the basis for clinical diagnosis 
of diseases associated with their induction. These proteins may be useful in the diagnosis and 
treatment of tumors, viral infections, inflammation, or conditions associated with impaired
immunity. Furthermore, these proteins may be used for investigations of the control of gene
expression by IFNs and other cytokines in normal and diseased cells. 

Based on the chemical and structural homology among the protein of SEQ ID NO:289 and 
the small hydrophobic IFI proteins from human and chimpanzee, it is believed that the protein of 
SEQ ID NO:289 is synthesized when interferons are produced in infections, inflammation,
autoimmune diseases etc. Interferons are produced in response to various cytokines and growth
factors, in viral infections, inflammation, autoimmune diseases, and cancers. Accordingly, the 
protein of SEQ ID NO:289 or fragments thereof may be used in diagnosis and treatment of diseases 
such as, but not limited to, autoimmune disorders such as rheumatoid arthritis, Graves disease,
systemic lupus erythematosus, autoimmune hepatitis, Wegener's granulomatosis, sarcoidosis,
polyarthritis, pemphigus, pemphigoid, erythema multiforme, Sjogren's syndrome, inflammatory
bowel disease, multiple sclerosis, myasthenia gravis, keratitis, scleritis, Type I diabetes,
insulin-dependent diabetes mellitus, Lupus Nephritis, and allergic encephalomyelitis; proliferative disorders
including various forms of cancer such as leukemias, lymphomas (Hodgkins and non-Hodgkins), 
sarcomas, melanomas, adenomas, carcinomas of solid tissue, hypoxic tumors, squamous cell 
carcinomas of the mouth, throat, larynx, and lung, genitourinary cancers such as cervical and 
bladder cancer, hematopoietic cancers, head and neck cancers, and nervous system cancers, benign 
lesions such as papillomas, atherosclerosis, angiogenesis; viral infections, in particular HCV and
HIV infections, as well as other pathogen-induced infections (e. g. leishmania). 

The protein of SEQ ID NO:289 or fragments thereof may also be used to treat conditions
associated with inflammation or immune impairment (e. g. rheumatoid arthritis, osteoarthritis and
AIDS).

Another embodiment of the present invention relates to the use of the protein of SEQ ID
NO:289 or fragments thereof to treat and/or prevent the ill effects of bacterial infection during
pregnancy in mammals, such as spontaneous abortion and maternal death. In a preferred
embodiment, the protein of the invention may be used to counteract the effects of the bacterial
endotoxin lipopolysaccharide (LPS). Methods for using such compositions are described in
Dziegielewska and Andersen, Biol. Neonate, 74:372-5 (1998), the disclosure of which is
incorporated herein by reference in its entirety.

Furthermore, the protein of SEQ ID NO:289 or fragments thereof are useful as a reagent for 
analyzing the control of gene expression by interferons and other cytokines in both normal and 
diseased cells. The protein of SEQ ID NO:289 or fragments thereof may be used to identify
specific molecules with which it binds such as agonists, antagonists or inhibitors.

Another embodiment of the present invention relates to methods of using the protein of 
SEQ ID NO:289 or fragments thereof to identify and/or quantify cytokines of the interferon family 
as well as other cytokines such as IL-10 and tumor antigens, which may interact with the protein of 
the invention. 

The protein of SEQ ID NO:289 or fragments thereof may also be included in
pharmaceutical preparations for treating cancer or prevention/treatment of other diseases associated
with changes in expression of the protein of the invention. In another embodiment of the present
invention, the protein of SEQ ID NO:289 or fragments thereof is used to inhibit and/or modulate the
effect of cytokines and related molecules such as IL-2, TNF alpha, CTLA4, CD28, and others, by
preventing the binding of endogenous cytokines to their natural receptors, thereby blocking cell
proliferation or inhibitory signals generated by the ligand-receptor binding event.

The protein of SEQ ID NO:289 or fragments thereof is useful to correct defects in in vivo
models of disease such as autoimmune, inflammation and tumor models, by injecting the protein
either intraperitoneally, intravenously, subcutaneously or directly into the diseased tissue.

The DNA encoding the protein of SEQ ID NO:289 or fragments thereof is useful in 
diagnostic assays for conditions/diseases associated with expression of the protein of the invention.
The diagnostic assay is useful to distinguish between absence, presence, and excess expression of 
the protein of the invention and to monitor regulation of levels of the protein of the invention during 
therapeutic intervention. The DNA may also be incorporated into effective eukaryotic expression 
vectors and directly targeted to a specific tissue, organ, or cell population for use in gene therapy to
treat the above mentioned conditions, including tumors and/or to correct disease- or genetic-induced
defects in any of the above mentioned proteins including the protein of the invention. The DNA 
may also be used to design antisense sequences and ribozymes, which can be administered to 
modify gene expression in tumor and pathogen-infected cells and to influence expression of 
cytokines and growth factors. In vivo delivery of genetic constructs into subjects can be developed
to the point of targeting specific cell types, such as tumor cells, in which expression of the protein of the
invention may be affected or may modulate the expression and/or activity of other proteins such as
cytokines, growth factors, their receptors and/or tumor antigens. The DNA is also useful to detect unknown
upstream sequences (e. g. promoters and regulatory elements) by standard techniques and for 
research into the control of gene expression by interferons and other cytokines, as well as growth
and transcription factors in normal and diseased cells. Hybridization probes are useful to detect
DNA encoding the protein of the invention (or closely related molecules) in biological samples, and 
for mapping the naturally occurring genomic sequence to a particular chromosome/chromosome 
region. The DNA may be used to generate and/or treat in vivo animal models of disease, including 
susceptibility or resistance to infection, inflammation, tumors and autoimmune conditions, as well
as tumor therapy, based on vaccine, knock-out and transgene technologies.

Antibodies against the protein of SEQ ID NO:289 or fragments thereof are useful for the
diagnosis of conditions and diseases associated with its expression and to quantify the protein of the 
invention (e. g. in assays to monitor patients during therapeutic intervention). Antibodies specific 
for the protein may include, but are not limited to, polyclonal, monoclonal, chimeric, single chain,
Fab fragments produced by a Fab expression library. Neutralizing antibodies are especially
preferred for diagnostics and therapeutics. Diagnostic assays for the protein of the invention include
methods utilizing the antibody and a label to detect the protein of the invention in human body 
fluids or extracts of cells or tissues. 

The protein of the invention and its catalytic or immunogenic fragments or oligopeptides
can be used for screening therapeutic compounds in any of a variety of drug screening
techniques, including high-throughput screening. Methods which may be used to quantitate the
expression of the nucleotide or protein of the invention include, but are not limited to, polymerase
chain reaction (PCR), RT-PCR, RNase protection, Northern and Western blotting, enzyme-linked
immunosorbent assay (ELISA), radioimmunoassay (RIA), fluorescence-activated cell sorting (FACS),
immunoprecipitation, and chromatography.

Accordingly, the present invention includes the use of the protein of SEQ ID NO:289,
fragments comprising at least 5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, or 200
consecutive amino acids thereof, or fragments having a desired biological activity to treat or
ameliorate a condition in an individual. For example, the condition may be cancer, including breast
cancer, viral infection, bacterial infection, inflammation, autoimmune disorders, rheumatoid
arthritis, Graves' disease, systemic lupus erythematosus, autoimmune hepatitis, Wegener's
granulomatosis, sarcoidosis, polyarthritis, pemphigus, pemphigoid, erythema multiforme, Sjogren's
syndrome, inflammatory bowel disease, multiple sclerosis, myasthenia gravis, keratitis, scleritis,
Type I diabetes, insulin-dependent diabetes mellitus, lupus nephritis, and allergic
encephalomyelitis; proliferative disorders including various forms of cancer such as leukemias,
lymphomas (Hodgkin's and non-Hodgkin's), sarcomas, melanomas, adenomas, carcinomas of solid
tissue, hypoxic tumors, squamous cell carcinomas of the mouth, throat, larynx, and lung,
genitourinary cancers such as cervical and bladder cancer, hematopoietic cancers, head and neck
cancers, and nervous system cancers; benign lesions such as papillomas; atherosclerosis;
angiogenesis; viral infections, in particular HCV and HIV infections; as well as other
pathogen-induced infections (e.g., leishmania).
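
The fragment lengths recited above can be illustrated with a short sketch. The following Python is
only a minimal illustration: the sequence used is a hypothetical placeholder (the sequence of SEQ ID
NO:289 is not reproduced here), and the function name `consecutive_fragments` is an assumption
introduced for this example.

```python
# Minimum fragment lengths recited in the text.
FRAGMENT_LENGTHS = [5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, 200]


def consecutive_fragments(sequence, length):
    """Return every fragment of exactly `length` consecutive residues."""
    if length <= 0:
        raise ValueError("length must be positive")
    # The range is empty when the sequence is shorter than the requested length.
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


# Hypothetical placeholder sequence, NOT the sequence of SEQ ID NO:289.
placeholder = "MKTAYIAKQRQISFVKSHFSRQ"

for n in FRAGMENT_LENGTHS:
    count = len(consecutive_fragments(placeholder, n))
    print(f"{n:>3} residues: {count} fragment(s)")
```

A sequence shorter than a recited length simply yields no fragments of that length, which is why the
longer entries in the list apply only to sufficiently long polypeptides.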

In such embodiments, the protein of SEQ ID NO:289, or a fragment thereof, is
administered to an individual in whom it is desired to increase or decrease any of the activities of
the protein of SEQ ID NO:289. The protein of SEQ ID NO:289 or fragment thereof may be
administered directly to the individual or, alternatively, a nucleic acid encoding the protein of SEQ
ID NO:289 or a fragment thereof may be administered to the individual. Alternatively, an agent
which increases the activity of the protein of SEQ ID NO:289 may be administered to the
individual. Such agents may be identified by contacting the protein of SEQ ID NO:289 or a cell or
preparation containing the protein of SEQ ID NO:289 with a test agent and assaying whether the
test agent increases the activity of the protein. For example, the test agent may be a chemical
compound or a polypeptide or peptide.

Alternatively, the activity of the protein of SEQ ID NO:289 may be decreased by
administering an agent which interferes with such activity to an individual. Agents which interfere
with the activity of the protein of SEQ ID NO:289 may be identified by contacting the protein of
SEQ ID NO:289 or a cell or preparation containing the protein of SEQ ID NO:289 with a test agent
and assaying whether the test agent decreases the activity of the protein. For example, the agent
may be a chemical compound, a polypeptide or peptide, an antibody, or a nucleic acid such as an
antisense nucleic acid or a triple helix-forming nucleic acid.

In one embodiment, the invention relates to methods and compositions using the protein of
the invention or part thereof as a marker protein to selectively identify the source of a sample as, for
example, brain, fetal brain, kidney, fetal kidney, or colon, or to distinguish between two or more
possible sources of a sample on the basis of the level of the protein of SEQ ID NO:289 in the
sample. For example, the protein of SEQ ID NO:289 or fragments thereof may be used to generate
antibodies using any techniques known to those skilled in the art, including those described herein.
Such antibodies may then be used to identify tissues of unknown origin, for example, forensic
samples or differentiated tumor tissue that has metastasized to foreign bodily sites, or to differentiate
different tissue types in a tissue cross-section using immunochemistry. In such methods a sample is
contacted with the antibody, which may be detectably labeled, under conditions which facilitate
antibody binding. The level of antibody binding to the test sample is measured and compared to the
level of binding to control cells from brain, fetal brain, kidney, fetal kidney, or colon or tissues other
than brain, fetal brain, kidney, fetal kidney, or colon to determine whether the test sample is from
brain, fetal brain, kidney, fetal kidney, or colon. Alternatively, the level of the protein of SEQ ID
NO:289 in a test sample may be measured by determining the level of RNA encoding the protein of
SEQ ID NO:289 in the test sample. RNA levels may be measured using nucleic acid arrays or
using techniques such as in situ hybridization, Northern blots, dot blots or other techniques familiar
to those skilled in the art. If desired, an amplification reaction, such as a PCR reaction, may be
performed on the nucleic acid sample prior to analysis. The level of RNA in the test sample is
compared to RNA levels in control cells from brain, fetal brain, kidney, fetal kidney, or colon or
tissues other than brain, fetal brain, kidney, fetal kidney, or colon to determine whether the test
sample is from brain, fetal brain, kidney, fetal kidney, or colon.
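
The comparison of a test sample against control tissues can be sketched as follows. This is a minimal
illustration under stated assumptions, not a prescribed procedure: the text does not specify a decision
rule, so the nearest-control rule, the function name `closest_source`, and all numeric values are
hypothetical.

```python
def closest_source(test_level, control_levels):
    """Return the control tissue whose measured level is nearest the test level.

    `control_levels` maps a tissue name to the level (protein or RNA) measured
    in control cells from that tissue. The nearest-level rule is an assumption
    made for illustration only.
    """
    if not control_levels:
        raise ValueError("at least one control tissue is required")
    return min(control_levels, key=lambda tissue: abs(control_levels[tissue] - test_level))


# Hypothetical control values in arbitrary units, invented for this example.
controls = {"brain": 8.0, "kidney": 5.0, "colon": 3.0, "other tissue": 0.5}
print(closest_source(7.2, controls))
```

In practice the measured levels would come from antibody binding or from RNA quantitation as described
above, and replicate controls would normally be used before assigning a source.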

In another embodiment, antibodies to the protein of the invention or part thereof may be
used for detection, enrichment, or purification of cells expressing the protein of SEQ ID NO:289,
using methods known to those skilled in the art. For example, an antibody against the
protein of SEQ ID NO:289 or a fragment thereof may be fixed to a solid support, such as a
chromatography matrix. A preparation containing cells expressing the protein of SEQ ID NO:289 is
placed in contact with the antibody under conditions which facilitate binding to the antibody. The
support is washed and then the cells are released from the support by contacting the support with
agents which cause the cells to dissociate from the antibody.

In another embodiment of the present invention, the protein of SEQ ID NO:289 or a
fragment thereof may be used to diagnose disorders associated with altered expression of
the protein of SEQ ID NO:289. In such techniques, the level of the protein of SEQ ID NO:289 in
an ill individual is measured using techniques such as those described herein. The level of the
protein of SEQ ID NO:289 in the ill individual is compared to the level in normal individuals to
determine whether the individual has a level of the protein of SEQ ID NO:289 which is associated
with disease.

Protein of SEQ ID NO:268 (internal designation 116-111-4-0-B3-CS)

The protein of SEQ ID NO:268 is encoded by the cDNA of SEQ ID NO:27. Accordingly,
it will be appreciated that all characteristics and uses of the polypeptide of SEQ ID NO:268
described throughout the present application also pertain to the polypeptide encoded by the human
cDNA of clone 116-111-4-0-B3-CS. In addition, it will be appreciated that all characteristics and
uses of the nucleic acid of SEQ ID NO:27 described throughout the present application also pertain
to the human cDNA of clone 116-111-4-0-B3-CS. The protein of the invention is found to be
expressed in testis and lungs.

The protein of SEQ ID NO:268, encoded by the extended cDNA of SEQ ID NO:27, is a
splicing variant of XAGE-1, a member of the CT antigen family overexpressed in Ewing sarcoma
(Liu, X. F., L. J. Helman, et al. (2000). Cancer Res 60(17): 4752-5, the disclosure of which is
incorporated herein by reference in its entirety). In addition, the protein of SEQ ID NO:268
shows strong homology at the COOH end with PAGE4, another member of the CT antigen family
(Brinkmann, U., G. Vasmatzis, et al. (1999) Cancer Res 59(7): 1445-8, the disclosure of which is
incorporated herein by reference in its entirety).

The cDNA of SEQ ID NO:27 is composed of 5 exons. Exon 1 spans nucleotides 1-245,
exon 2 spans nucleotides 246-370, exon 3 spans nucleotides 371-512, exon 4 spans
nucleotides 513-639, and exon 5 spans nucleotides 640-762. Exons 2 to 5 of the cDNA of
SEQ ID NO:27 are shared in part with XAGE-1. However, since the initiation codon of SEQ ID
NO:27 is located in intron 1 of XAGE-1, there is a frameshift in the alignment of the two molecules.
Exon 2 of SEQ ID NO:27 corresponds to nucleotides 110-234 of XAGE-1, exon 3 of SEQ ID NO:27
corresponds to nucleotides 235-376 of XAGE-1, exon 4 of SEQ ID NO:27 corresponds to nucleotides
377-503 of XAGE-1, and exon 5 of SEQ ID NO:27 corresponds in part to nucleotides 504-526 of XAGE-1.
XAGE-1 is overexpressed in sarcoma and alveolar rhabdomyosarcoma and is also highly
expressed in normal testis (Liu, X. F., L. J. Helman, et al. (2000). Cancer Res 60(17): 4752-5, the
disclosure of which is incorporated herein by reference in its entirety). In addition, XAGE-1 shares
homology with PAGE-4 (Brinkmann, U., G. Vasmatzis, et al. (1999) Cancer Res 59(7): 1445-8, the
disclosure of which is incorporated herein by reference in its entirety) at the COOH end.
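
The exon coordinates given above can be checked with simple interval arithmetic. The sketch below is
offered only as an illustration; the helper name `interval_length` is an assumption, and the
coordinates are copied from the text. It shows that the five cDNA exons sum to 762 nucleotides, and
that the first three XAGE-1 intervals listed (125, 142 and 127 nucleotides) have the same lengths as
exons 2, 3 and 4 of the cDNA, while the last XAGE-1 interval covers only 23 nucleotides.

```python
def interval_length(start, end):
    """Length of an inclusive nucleotide interval."""
    return end - start + 1


# Exon coordinates within the cDNA of SEQ ID NO:27, as stated in the text.
cdna_exons = {1: (1, 245), 2: (246, 370), 3: (371, 512), 4: (513, 639), 5: (640, 762)}

# Intervals on XAGE-1, in the order in which the text lists them.
xage1_intervals = [(110, 234), (235, 376), (377, 503), (504, 526)]

cdna_lengths = {n: interval_length(*coords) for n, coords in cdna_exons.items()}
xage1_lengths = [interval_length(*coords) for coords in xage1_intervals]

print("cDNA exon lengths:", cdna_lengths)
print("XAGE-1 interval lengths:", xage1_lengths)
print("total cDNA length:", sum(cdna_lengths.values()))
```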

CT antigens are a distinct class of differentiation antigens that are expressed by cancers
arising in nonessential normal tissues such as prostate, breast, and ovary (G. Vasmatzis et al., Proc.
Natl. Acad. Sci. USA, 95: 300-304, 1998, the disclosure of which is incorporated herein by
reference in its entirety) and that have a restricted pattern of expression in normal tissues. This
class of antigens is presented on the surface of tumor cells and is recognized by cytolytic T cells,
leading to lysis. To the extent that these antigens have been studied, it has been through cytolytic
T cell characterization studies in vitro, i.e., the study of the identification of the antigen by a
particular cytolytic T cell ("CTL" hereafter) subset. The subset proliferates upon recognition of the
presented tumor rejection antigen, and the cells presenting the antigen are lysed. Characterization
studies have identified CTL clones which specifically lyse cells expressing the antigens. Examples
of this work may be found in Levy et al., Adv. Cancer Res. 24: 1-59 (1977); Boon et al., J. Exp. Med.
152: 1184-1193 (1980); Brunner et al., J. Immunol. 124: 1627-1634 (1980); Maryanski et al., Eur. J.
Immunol. 124: 1627-1634 (1980); Maryanski et al., Eur. J. Immunol. 12: 406-412 (1982); Palladino
et al., Canc. Res. 47: 5074-5079 (1987), the disclosures of which are incorporated herein by
reference in their entireties.

Some thoroughly studied CT antigens are MAGE, BAGE, GAGE and LAGE; others, including
PAGE and XAGE, have since been added, most of them located on chromosome X. Brinkmann et al.
reported the identification of three new members of the GAGE/PAGE family, termed XAGEs.
XAGE-1 and XAGE-2 are expressed in Ewing's sarcoma, rhabdomyosarcoma, a breast cancer, and
a germ cell tumor.

It is believed that the protein of SEQ ID NO:268 is a splicing variant of XAGE-1, a CT
antigen overexpressed in Ewing sarcoma.

Accordingly, the present invention includes the use of the protein of SEQ ID NO:268,
fragments comprising at least 5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, or 200
consecutive amino acids thereof, or fragments having a desired biological activity to treat or
ameliorate a condition, such as those listed above, associated with over- or underexpression of the
protein of SEQ ID NO:268. In such embodiments, the protein of SEQ ID NO:268, or a fragment
thereof, is administered to an individual in whom it is desired to increase or decrease any of the
activities of the protein of SEQ ID NO:268. The protein of SEQ ID NO:268 or fragment thereof may
be administered directly to the individual or, alternatively, a nucleic acid encoding the protein of
SEQ ID NO:268 or a fragment thereof may be administered to the individual. Alternatively, an
agent which increases the activity of the protein of SEQ ID NO:268 may be administered to the
individual. Such agents may be identified by contacting the protein of SEQ ID NO:268 or a cell or
preparation containing the protein of SEQ ID NO:268 with a test agent and assaying whether the
test agent increases the activity of the protein. For example, the test agent may be a chemical
compound or a polypeptide or peptide.

Alternatively, the activity of the protein of SEQ ID NO:268 may be decreased by
administering an agent which interferes with such activity to an individual. Agents which interfere
with the activity of the protein of SEQ ID NO:268 may be identified by contacting the protein of
SEQ ID NO:268 or a cell or preparation containing the protein of SEQ ID NO:268 with a test agent
and assaying whether the test agent decreases the activity of the protein. For example, the agent
may be a chemical compound, a polypeptide or peptide, an antibody, or a nucleic acid such as an
antisense nucleic acid or a triple helix-forming nucleic acid.
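
The identification of agents that increase or decrease activity can be summarized in a short sketch.
It is a minimal illustration under stated assumptions: the text does not define an assay readout or a
cutoff, so the function name `classify_agent` and the 10% `tolerance` are hypothetical choices made
only for this example.

```python
def classify_agent(baseline_activity, activity_with_agent, tolerance=0.10):
    """Label a test agent by its effect on the measured activity of the protein.

    `tolerance` is an assumed relative change below which the agent is treated
    as having no clear effect; it is not specified in the text.
    """
    if baseline_activity <= 0:
        raise ValueError("baseline activity must be positive")
    change = (activity_with_agent - baseline_activity) / baseline_activity
    if change > tolerance:
        return "increases activity"
    if change < -tolerance:
        return "decreases activity"
    return "no clear effect"


# Hypothetical readouts in arbitrary units.
print(classify_agent(100.0, 150.0))
print(classify_agent(100.0, 40.0))
```

The same comparison applies whether the protein is assayed in isolation or in a cell or preparation
containing it; only the source of the two measurements changes.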

In one embodiment, the invention relates to methods and compositions using the protein of
the invention or part thereof as a marker protein to selectively identify tissues, preferably testis and
lungs, or to distinguish between two or more possible sources of a tissue sample on the basis of the
level of the protein of SEQ ID NO:268 in the sample. For example, the protein of SEQ ID NO:268
or fragments thereof may be used to generate antibodies using any techniques known to those
skilled in the art, including those described herein. Such tissue-specific antibodies may then be
used to identify tissues of unknown origin, for example, forensic samples or differentiated tumor
tissue that has metastasized to foreign bodily sites, or to differentiate different tissue types in a tissue
cross-section using immunochemistry. In such methods a tissue sample is contacted with the
antibody, which may be detectably labeled, under conditions which facilitate antibody binding. The
level of antibody binding to the test sample is measured and compared to the level of binding to
control cells from testis or lungs or tissues other than testis or lungs to determine whether the test
sample is from testis or lungs. Alternatively, the level of the protein of SEQ ID NO:268 in a test
sample may be measured by determining the level of RNA encoding the protein of SEQ ID NO:268
in the test sample. RNA levels may be measured using nucleic acid arrays or using techniques such
as in situ hybridization, Northern blots, dot blots or other techniques familiar to those skilled in the
art. If desired, an amplification reaction, such as a PCR reaction, may be performed on the nucleic
acid sample prior to analysis. The level of RNA in the test sample is compared to RNA levels in
control cells from testis or lungs or tissues other than testis or lungs to determine whether the test
sample is from testis or lungs.

In another embodiment, antibodies to the protein of the invention or part thereof may be
used for detection, enrichment, or purification of cells expressing the protein of SEQ ID NO:268,
including Ewing sarcoma cells, rhabdomyosarcoma cells, breast cancer cells and germ cell tumor
cells, using methods known to those skilled in the art. For example, an antibody against the protein
of SEQ ID NO:268 or a fragment thereof may be fixed to a solid support, such as a chromatography
matrix. A preparation containing cells expressing the protein of SEQ ID NO:268 is placed in
contact with the antibody under conditions which facilitate binding to the antibody. The support is
washed and then the cells are released from the support by contacting the support with agents which
cause the cells to dissociate from the antibody.

In another embodiment of the present invention, the protein of SEQ ID NO:268 or a
fragment thereof may be used to diagnose disorders associated with altered expression of
the protein of SEQ ID NO:268. In some embodiments, the protein of SEQ ID NO:268 or fragments
thereof may be used to diagnose Ewing sarcoma, rhabdomyosarcoma, breast cancer or germ cell
tumors. In such techniques, the level of the protein of SEQ ID NO:268 in an ill individual is
measured using techniques such as those described herein. The level of the protein of SEQ ID
NO:268 in the ill individual is compared to the level in normal individuals. An elevated level or
decreased level of the protein of SEQ ID NO:268 relative to normal individuals suggests that the ill
individual is suffering from a defect in intercellular communication or secretion.

Another embodiment of the invention relates to compositions and methods using the protein
of SEQ ID NO:268 or a fragment thereof as possible targets for vaccine-based therapies of cancer,
including Ewing sarcoma, rhabdomyosarcoma, breast cancer or germ cell tumors. In such
embodiments, an antibody against the protein of SEQ ID NO:268 or a fragment thereof is
administered to an individual suffering from cancer in an amount sufficient to ameliorate or
eliminate the cancer.

Protein of SEQ ID NO:399 (internal designation 160-40-1-0-H4-CS)

The protein of SEQ ID NO:399 is encoded by the cDNA of SEQ ID NO:158. Accordingly,
it will be appreciated that all characteristics and uses of the polypeptide of SEQ ID NO:399
described throughout the present application also pertain to the polypeptide encoded by the human
cDNA of clone 160-40-1-0-H4-CS. In addition, it will be appreciated that all characteristics and
uses of the nucleic acid of SEQ ID NO:158 described throughout the present application also
pertain to the human cDNA of clone 160-40-1-0-H4-CS. The protein of the invention is found to be
expressed in testis and lungs. It is overrepresented in fetal brain.

The protein of SEQ ID NO:399, encoded by the cDNA of SEQ ID NO:158, is homologous to
proteins of the Phosphatidic Acid Phosphatase type 2 (PAP2) superfamily (Stukey J. and Carman
G.M., Protein Sci 1997;6:469-472, the disclosure of which is incorporated herein by reference in its
entirety). Three variants of human PAP, i.e. PAP-alpha 2 (W79285) and its alternatively spliced
form PAP-alpha 1 (W79284), PAP-beta (W79286) and PAP-gamma (W79287), have been
identified. The protein of SEQ ID NO:399 displays the characteristic Pfam domain of the PAP2
superfamily from positions 19 to 175. Accordingly, one embodiment of the present invention is a
polypeptide comprising amino acid residues 19 to 175 of SEQ ID NO:399. Four membrane
spanning domains are predicted from amino acid positions 17 to, 47 to 67, 108 to 128, and 141 to
161. Accordingly, another embodiment of the present invention is a polypeptide comprising one or
more of the foregoing membrane spanning domains.
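
The residue ranges recited above can be checked with a small sketch using 1-based, inclusive
positions. It is an illustration only: the sequence is a hypothetical placeholder of 180 residues (the
length stated for the protein elsewhere in this section), not the sequence of SEQ ID NO:399; the helper
names are assumptions; and only the three membrane-spanning segments whose boundaries are both
given in the text are included.

```python
PROTEIN_LENGTH = 180      # length stated in the text for the protein of SEQ ID NO:399
PAP2_DOMAIN = (19, 175)   # characteristic PAP2 domain, 1-based inclusive
# Membrane-spanning segments whose start and end positions are both given.
SEGMENTS = [(47, 67), (108, 128), (141, 161)]


def subsequence(sequence, start, end):
    """Return residues start..end using 1-based inclusive positions."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("positions out of range")
    return sequence[start - 1:end]


def within(inner, outer):
    """True if the interval `inner` lies entirely inside `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


# Hypothetical placeholder, NOT the sequence of SEQ ID NO:399.
placeholder = "A" * PROTEIN_LENGTH

domain = subsequence(placeholder, *PAP2_DOMAIN)
print("domain length:", len(domain))
for seg in SEGMENTS:
    print(seg, "length:", len(subsequence(placeholder, *seg)),
          "inside PAP2 domain:", within(seg, PAP2_DOMAIN))
```

Each of the three fully specified segments is 21 residues long and lies within the 157-residue PAP2
domain.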

Phosphatidic acid phosphatase (PAP) (also referred to as phosphatidate phosphohydrolase)
is known to be an important enzyme for glycerolipid biosynthesis. In particular, PAP catalyzes the
conversion of phosphatidic acid (PA) into diacylglycerol (DAG). PA and DAG are lipids involved
in signal transduction and in structural membrane-lipid biosynthesis in cells; they thus represent an
important regulatory point in eukaryotic phospholipid metabolism. DAG is a well-studied lipid
second messenger which is essential for the activation of protein kinase C (Kent; Annu. Rev.
Biochem.; 64:315-343; 1995), whereas PA itself is also a lipid messenger implicated in various
signaling pathways such as NADPH oxidase activation and calcium mobilization (English; Cell
Signal.; 8:341-347; 1996, the disclosure of which is incorporated herein by reference in its entirety).
The regulation of PAP activity can therefore affect the balance of divergent signaling processes that
the cell receives in terms of PA and DAG (Brindley et al.; Chem. Phys. Lipids 80:45-57; 1996, the
disclosure of which is incorporated herein by reference in its entirety).

PAP exists in at least two isoforms, one of which (PAP1) is presumed to be cytosolic and
membrane associated and the other (PAP2) to be an integral membrane protein (Leung D.W.,
Tompkins C.K., White T.; DNA Cell Biol. 17:377-385 (1998)). The protein of the invention has
180 amino acids and four predicted membrane-spanning segments, and so is presumed to be an
integral membrane protein.

The protein of SEQ ID NO:399 is encoded by a cDNA that has homology to many
alternatively spliced forms of PAP2 genes. For example, the protein of SEQ ID NO:399 has 29%
homology with human phosphatidic acid phosphohydrolase type-2C protein. The protein of SEQ ID
NO:399 also has 40% homology with human phosphatidic acid phosphatase 2B protein. In
addition, the protein of SEQ ID NO:399 has 33% homology with human type 2 phosphatidic acid
phosphatase alpha-2 protein. PAP2-alpha2 is one of two isoforms, together with PAP2-alpha1,
presumed to be alternative splice variants from a single gene.

Northern analysis has shown that PAP2-alpha mRNA expression was suppressed in several
tumor tissues, indicating that PAP-2 may act as a tumor suppressor. The relationship of PAP and
tumor suppression is further evidenced in findings that PAP activity is lower in fibroblast cell lines
transformed with either the ras or fps oncogene than in the parental rat1 cell line (Brindley et al;
Chem. Phys. Lipids 80:45-57; 1996, the disclosure of which is incorporated herein by reference in
its entirety). As discussed above, a decrease in PAP activity in transformed cells correlates with a
concomitant increase in PA concentration. Moreover, elevated PAP activity and lower levels of PA
have been observed in contact-inhibited fibroblasts relative to proliferating and transformed
fibroblasts (Brindley et al; Chem. Phys. Lipids 80:45-57; 1996, the disclosure of which is
incorporated herein by reference in its entirety). Therefore, the protein of SEQ ID NO:399 or
fragments thereof may be used to decrease cell division and as such can provide a useful tool in
treating cancer. Subsequent analysis of colon tumor tissue derived from four donors confirmed
lower expression of PAP2-alpha than in matching normal colon tissue. Considering these data and
previous demonstrations that certain transformed cell lines have lower PAP activity, human PAP
cDNAs may be used for gene therapy for certain tumors (Leung D.W., Tompkins C.K., White
T.; DNA Cell Biol. 17:377-385 (1998), the disclosure of which is incorporated herein by reference
in its entirety). Accordingly, one embodiment of the present invention is the use of the protein of
SEQ ID NO:399 or a fragment thereof as a tumor suppressor. For example, a nucleic acid
expressing the protein of SEQ ID NO:399 or a fragment thereof may be introduced into an
individual suffering from cancer in order to ameliorate or eliminate the cancer. In fact, nucleic
acids encoding human phosphatidic acid phosphatases have been used to regulate levels of lipid
cellular mediators and in gene therapy of, e.g., cancer (PCT publication WO98/46730, the disclosure
of which is incorporated herein by reference in its entirety).

In another embodiment of the present invention, the protein of SEQ ID NO:399 or a
fragment thereof can be used to control the balance of lipid mediators of cellular activation and
signal transduction. The protein of the invention has 33% homology with human phosphatidic acid
phosphatase 2A protein. PAP2A is an integral membrane glycoprotein at the cell surface that plays
an active role in the hydrolysis and uptake of lipids from the extracellular space (Roberts RZ,
Morris AJ; Biochim Biophys Acta 2000 Aug 24; 1487(1):33-49, the disclosure of which is
incorporated herein by reference in its entirety). Accordingly, the level or activity of the protein of
SEQ ID NO:399 may be modulated to influence the rate or extent of hydrolysis and uptake of lipids
from the extracellular space using methods such as those described herein.

In another embodiment of the present invention, the protein of SEQ ID NO:399 can be used
to counterbalance the inflammatory response. PA has been implicated in cytokine-induced
inflammatory responses (Bursten et al; Circ. Shock 44: 14-29, 1994; Abraham et al; J. Exp. Med.
181: 569-575, 1995; Rice et al; PNAS 91: 3857-3861, 1994; Leung et al; PNAS 92: 4813-4817,
1995, the disclosures of which are incorporated herein by reference in their entireties) and the
modulation of numerous protein kinases involved in signal transduction (English et al; Chem. Phys.
Lipids 80: 117-132, 1996, the disclosure of which is incorporated herein by reference in its
entirety). In addition, a nucleic acid encoding the protein of SEQ ID NO:399 or a fragment thereof
may be used to counterbalance the inflammatory response to cytokine stimulation through
degradation of excess PA in cells, or to treat or ameliorate inflammatory diseases.

The gene encoding the protein of SEQ ID NO:399 or a fragment thereof can also be used in
gene therapy for the treatment of obesity associated with diabetes. PAP activity is decreased in the
livers and hearts of the grossly obese and insulin-resistant JCR:LA corpulent rat compared to the
control lean phenotype (Brindley et al; Chem. Phys. Lipids 80:45-57; 1996, the disclosure of
which is incorporated herein by reference in its entirety). The protein of the invention therefore can
provide an important tool for the treatment of obesity associated with diabetes.

Accordingly, the present invention includes the use of the protein of SEQ ID NO:399,
fragments comprising at least 5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, or 200
consecutive amino acids thereof, or fragments having a desired biological activity to treat or
ameliorate a condition, such as those listed above, in an individual. In such embodiments, the
protein of SEQ ID NO:399, or a fragment thereof, is administered to an individual in whom it is
desired to increase or decrease any of the activities of the protein of SEQ ID NO:399, including
glycerolipid biosynthesis, conversion of phosphatidic acid into diacylglycerol, signal transduction,
membrane-lipid biosynthesis, activation of protein kinase C, NADPH oxidase activation, calcium
mobilization, cell division, production of diacylglycerol, monoacylglycerol, ceramide or
sphingosine, modulation of the inflammatory response or dephosphorylation of a substrate such as
lysophosphatidic acid, ceramide 1-phosphate, or sphingosine 1-phosphate, or treatment or
amelioration of obesity associated with diabetes. The protein of SEQ ID NO:399 or fragment
thereof may be administered directly to the individual or, alternatively, a nucleic acid encoding the
protein of SEQ ID NO:399 or a fragment thereof may be administered to the individual.
Alternatively, an agent which increases the activity of the protein of SEQ ID NO:399 may be
administered to the individual. Such agents may be identified by contacting the protein of SEQ ID
NO:399 or a cell or preparation containing the protein of SEQ ID NO:399 with a test agent and
assaying whether the test agent increases the activity of the protein. For example, the test agent
may be a chemical compound or a polypeptide or peptide.

Alternatively, the activity of the protein of SEQ ID NO:399 may be decreased by
administering an agent which interferes with such activity to an individual. Agents which interfere
with the activity of the protein of SEQ ID NO:399 may be identified by contacting the protein of
SEQ ID NO:399 or a cell or preparation containing the protein of SEQ ID NO:399 with a test agent
and assaying whether the test agent decreases the activity of the protein. For example, the agent
may be a chemical compound, a polypeptide or peptide, an antibody, or a nucleic acid such as an
antisense nucleic acid or a triple helix-forming nucleic acid.

In one embodiment, the invention relates to methods and compositions using the protein of
the invention or part thereof as a marker protein to selectively identify tissues, preferably brain, or
to distinguish between two or more possible sources of a tissue sample on the basis of the level of
the protein of SEQ ID NO:399 in the sample. For example, the protein of SEQ ID NO:399 or
fragments thereof may be used to generate antibodies using any techniques known to those skilled
in the art, including those described herein. Such tissue-specific antibodies may then be used to
identify tissues of unknown origin, for example, forensic samples or differentiated tumor tissue that
has metastasized to foreign bodily sites, or to differentiate different tissue types in a tissue
cross-section using immunochemistry. In such methods a tissue sample is contacted with the antibody,
which may be detectably labeled, under conditions which facilitate antibody binding. The level of
antibody binding to the test sample is measured and compared to the level of binding to control cells
from brain or tissues other than brain to determine whether the test sample is from brain.
Alternatively, the level of the protein of SEQ ID NO:399 in a test sample may be measured by
determining the level of RNA encoding the protein of SEQ ID NO:399 in the test sample. RNA
levels may be measured using nucleic acid arrays or using techniques such as in situ hybridization,
Northern blots, dot blots or other techniques familiar to those skilled in the art. If desired, an
amplification reaction, such as a PCR reaction, may be performed on the nucleic acid sample prior
to analysis. The level of RNA in the test sample is compared to RNA levels in control cells from
brain or tissues other than brain to determine whether the test sample is from brain.

In another embodiment, antibodies to the protein of the invention or part thereof may be
used for detection, enrichment, or purification of cells expressing the protein of SEQ ID NO:399,
using methods known to those skilled in the art. For example, an antibody against the
protein of SEQ ID NO:399 or a fragment thereof may be fixed to a solid support, such as a
chromatography matrix. A preparation containing cells expressing the protein of SEQ ID NO:399 is
placed in contact with the antibody under conditions which facilitate binding to the antibody. The
support is washed and then the cells are released from the support by contacting the support with
agents which cause the cells to dissociate from the antibody.

In another embodiment of the present invention, the protein of SEQ ID NO:399 or a
fragment thereof may be used to diagnose disorders associated with altered expression of
the protein of SEQ ID NO:399. In some embodiments, the protein of SEQ ID NO:399 or fragments
thereof may be used to diagnose cancer. In such techniques, the level of the protein of SEQ ID
NO:399 in an ill individual is measured using techniques such as those described herein. The level
of the protein of SEQ ID NO:399 in the ill individual is compared to the level in normal individuals.
An elevated level or decreased level of the protein of SEQ ID NO:399 relative to normal individuals
suggests that the ill individual may suffer from cancer or be predisposed to developing cancer in the
future.

In another embodiment, the present invention relates to methods of preparing a PAP protein 
of SEQ ID NO:399 comprising the steps of (i) transforming a host cell with an expression vector 
comprising a polynucleotide encoding SEQ ID NO:399, (ii) culturing the transformed host cells 
which express the protein and (iii) isolating the protein. The present invention also relates to a 
method of dephosphorylating a substrate comprising contacting the substrate with an effective 
amount of isolated protein of SEQ ID NO:399 or a fragment thereof such that the protein catalyzes 
the dephosphorylation of the substrate. It is further provided that this method occurs in vitro, and 
comprises a step of isolating the dephosphorylated substrate. Additionally, the method can occur in 
vivo, and is effected by the administration of the protein of the invention (or part of it) to a mammal 
in need thereof. 

Protein of SEQ ID NOs:258 and 262 (internal designations 110-007-1-0-C7-CS, 116-055-1-0-A3-CS): 

The protein of SEQ ID NO:258 is encoded by the cDNA of SEQ ID NO:17. Accordingly, 
it will be appreciated that all characteristics and uses of the polypeptide of SEQ ID NO:258 
described throughout the present application also pertain to the polypeptide encoded by the human 
cDNA of clone 110-007-1-0-C7-CS. In addition, it will be appreciated that all characteristics and 
uses of the nucleic acid of SEQ ID NO:17 described throughout the present application also pertain 
to the human cDNA of clone 110-007-1-0-C7-CS. The protein of SEQ ID NO:258 shows 
homologies to two high affinity IgE receptor-like proteins (IGER) with GENESEQP accession 
numbers W96745 and W41056, the disclosures of which are incorporated herein by reference in 
their entireties. The protein of SEQ ID NO:258 is expressed in liver and testis. The protein of SEQ 
ID NO:262, encoded by SEQ ID NO:21, is a variant of the protein of SEQ ID NO:258 and shares 
all the potential uses and functions described herein. This protein and cDNA share all of the 
characteristics and uses of the clone 116-055-1-0-A3-CS and the product thereof. 

Like the two high affinity IgE receptor-like proteins, the protein of the invention contains 
four transmembrane spanning domains of 20 amino acids, between amino acids 53-73, 79-99, 121- 
141 and 158-178, respectively. The protein of SEQ ID NO:258 crosses the plasma membrane four 
times, forming two small extracellular loops, and has both the N- and C-termini in the cytoplasm. 
Moreover, the protein of the invention contains a signal peptide (cleavage site at position 21). 

The predicted structure of the protein of SEQ ID NO:258 demonstrates the relationship of 
this protein to FcεRIβ and CD20 antigen and provides evidence for a family of 4-transmembrane 
spanning proteins. The conservation of amino acids between all three proteins is highest in the four 
transmembrane domains. While greater divergence exists in the hydrophilic amino and carboxyl 
termini, several amino acids within these regions are conserved, such as the presence of 4 prolines in 
the amino terminus of all three proteins. In addition, two cysteine residues (positions 147 and 156) 
are present in the second extracellular domain between TM3 and TM4. This suggests that inter- or 
intramolecular disulfide bonds in this domain are present in all three proteins. 
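
By way of illustration only, the topology described above can be checked with a short Python sketch. The transmembrane spans and cysteine positions are those stated in the text; the names TM_SPANS, CYSTEINES and loops_between are hypothetical and are introduced here only for the example. The sketch assumes that the N-terminus is cytoplasmic, so that the side of the membrane alternates at each crossing.

```python
# Transmembrane (TM) spans as stated in the text (1-based, inclusive).
TM_SPANS = [(53, 73), (79, 99), (121, 141), (158, 178)]
# Cysteine positions as stated in the text.
CYSTEINES = (147, 156)


def loops_between(spans):
    """Return the loops between consecutive TM spans as (start, end, side).

    Assumption: the N-terminus is cytoplasmic, so the first loop is
    extracellular, the second cytoplasmic, and so on.
    """
    loops = []
    for i in range(len(spans) - 1):
        start = spans[i][1] + 1       # residue after the end of this TM span
        end = spans[i + 1][0] - 1     # residue before the start of the next
        side = "extracellular" if i % 2 == 0 else "cytoplasmic"
        loops.append((start, end, side))
    return loops


LOOPS = loops_between(TM_SPANS)
EXTRACELLULAR = [(s, e) for s, e, side in LOOPS if side == "extracellular"]

if __name__ == "__main__":
    for start, end, side in LOOPS:
        print(f"{start}-{end}: {side}, {end - start + 1} residues")
```

Under these assumptions the sketch yields two extracellular loops, and both cysteine positions fall within the second of them (between TM3 and TM4), which agrees with the description. An even number of membrane crossings also places the C-terminus on the same side as the N-terminus.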

FcεRIβ is part of a tetrameric receptor complex consisting of an α chain, a β chain and two γ 
chains (Kinet et al. Proc. Natl. Acad. Sci. USA, 85: 6483-6487 (1988), the disclosure of which is 
incorporated herein by reference in its entirety). Together, they mediate interaction with IgE-bound 
antigens leading to dramatic cellular responses, such as the massive degranulation of mast cells. 
The β subunit is a 4-transmembrane protein with both the amino and carboxyl termini residing in 
the cytoplasm. 

Chromosome mapping localized the cDNA of SEQ ID NO:17 to chromosome 11q12, the 
location of the CD20 gene. Moreover, the murine FcεRIβ and Ly-44 (the murine equivalent of 
CD20) are both located in the same position on mouse chromosome 19 (Tedder, T.F. et al., J. 
Immunol. 141:4388-4394 (1988), Clark E.A. and Lane, J.L. Annu. Rev. Immunol. 9:97-127 (1991), 
the disclosures of which are incorporated herein by reference in their entireties). Therefore, the 
three genes are believed to have originated and evolved from the same locus, further 
supporting the proposition that they are members of the same family of related proteins. 

On the basis of the foregoing information, it is believed that the protein of SEQ ID NO:258 
is a high affinity immunoglobulin E receptor-like protein. 

Atopic diseases, which include allergy, asthma, atopic dermatitis (or eczema) and allergic 
rhinitis, are generally defined as disorders of Immunoglobulin E (IgE) responses to common 
antigens, such as pollen or house dust mites. Atopy is frequently detected by elevated total serum 
IgE levels, antigen-specific IgE responses or positive skin tests to common allergens. In principle, 
atopy can result from dysregulation of any part of the pathway which runs from antigen exposure 
and IgE response, through the interaction of IgE with its receptor on mast cells, the high affinity Fc 
receptor FcεRI, to the subsequent cellular activation mediated by that ligand-receptor 
engagement (Ravetch, Nature Genetics, 7: 117-118 (1994), the disclosure of which is incorporated 
herein by reference in its entirety). 

Accordingly, the protein of SEQ ID NO:258 or fragments comprising at least 5, 8, 10, 12, 
15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, or 200 consecutive amino acids thereof, or fragments 
having a desired biological activity may be administered to an individual in whom it is desired to 
increase or decrease the activity of the protein of SEQ ID NO:258. In particular, the protein of SEQ 
ID NO:258 or fragment thereof may be administered to an individual in whom it is desired to 
regulate the extent of the IgE response. In such methods, the protein of SEQ ID NO:258 or 
fragment thereof may be administered directly to the individual or, alternatively, a nucleic acid 
encoding the protein of SEQ ID NO:258 or a fragment thereof may be administered to the 
individual. Alternatively, an agent which increases the activity of the protein of SEQ ID NO:258 
may be administered to the individual. Such agents may be identified by contacting the protein of 
SEQ ID NO:258 or a cell or preparation containing the protein of SEQ ID NO:258 with a test agent 
and assaying whether the test agent increases the activity of the protein. For example, the test agent 
may be a chemical compound or a polypeptide or peptide. 

The protein of SEQ ID NO:258 or fragments thereof may also be used to identify genes or 
polypeptides that may play a role in IgE responses or atopic disease. In particular, binding partners 
for the protein of SEQ ID NO:258 or the genes encoding such binding partners may be identified 
using a variety of techniques familiar to those skilled in the art, including the techniques described 
herein. 

The protein of SEQ ID NO:258 or the polynucleotide encoding the protein of SEQ ID 
NO:258 may also be used to diagnose hereditary atopy. In particular, the level of the protein of 
SEQ ID NO:258 may be determined in a test individual using methods such as those described 
herein and compared to the levels of normal individuals and individuals suffering from hereditary 
atopy to determine whether the test individual is suffering from or at risk of suffering from hereditary 
atopy. Alternatively, a nucleic acid sample may be obtained from a test individual and analyzed to 
determine whether it contains a level of RNA encoding the protein of SEQ ID NO:258 which is 
associated with hereditary atopy or a mutation in the gene encoding the protein of SEQ ID NO:258 
which is associated with hereditary atopy. For example, a nucleic acid sample from the test 
individual may be contacted with a nucleic acid probe comprising the nucleic acid encoding the 
protein of SEQ ID NO:258 or a fragment thereof to determine the RNA level or whether the 
individual has a mutation associated with hereditary atopy. The probe may be either DNA, 
including cDNA or genomic DNA, or the probe may be RNA. Any of the methods familiar to those 
skilled in the art may be used in these diagnostic methods, including the methods described herein. 
For example, the presence of a mutation associated with hereditary atopy can be determined using 
methods generally known in the art, such as but not limited to PCR, sequencing or minisequencing 
as described in the method of Yamamoto et al. (Biochem. Biophys. Res. Comm., 182:507 (1992), 
the disclosure of which is incorporated by reference herein in its entirety). 

The protein of SEQ ID NO:258 can also be used to characterize the induction of expression 
of FcεRI and the particular function of FcεRIβ. As such, the protein of the invention can be useful 
in, for example, the design of drugs that block or inhibit induction or activity of FcεRI, thereby 
treating atopic diseases. In particular, test agents which block or inhibit induction or activity may 
be identified using the methods described herein. 

In another embodiment, the protein of SEQ ID NO:258 can be employed in the preparation 
of antibodies, such as monoclonal antibodies, according to methods known in the art, including 
those described herein. The antibodies can be used to block or mimic ligand binding to the receptor 
comprising the protein of the invention or other receptors, such as but not limited to FcεRI. The 
antibodies can also be used to isolate the protein of SEQ ID NO:258 or cells which express the 
protein of SEQ ID NO:258 using methods such as those described herein. For example, the 
antibodies may be used to detect the presence of cells containing the protein of SEQ ID NO:258 
(including but not limited to hematopoietic cells) in a sample. Such a method comprises 
contacting the sample with the antibody under conditions sufficient for the antibody to bind to the 
protein of SEQ ID NO:258 and detecting the presence of bound antibody using methods known in 
the art, including those described herein. 

In one embodiment, the invention relates to methods and compositions using the protein of 
the invention or part thereof as a marker protein to selectively identify tissues, preferably liver and 
testis, or to distinguish between two or more possible sources of a tissue sample on the basis of the 
level of the protein of SEQ ID NO:258 in the sample. For example, the protein of SEQ ID NO:258 
or fragments thereof may be used to generate antibodies using any techniques known to those 
skilled in the art, including those described herein. Such tissue-specific antibodies may then be 
used to identify tissues of unknown origin, for example, forensic samples, differentiated tumor tissue 
that has metastasized to foreign bodily sites, or to differentiate different tissue types in a tissue 
cross-section using immunochemistry. In such methods a tissue sample is contacted with the 
antibody, which may be detectably labeled, under conditions which facilitate antibody binding. The 
level of antibody binding to the test sample is measured and compared to the level of binding to 
control cells from liver or testis or tissues other than liver or testis to determine whether the test 
sample is from liver or testis. Alternatively, the level of the protein of SEQ ID NO:258 in a test 
sample may be measured by determining the level of RNA encoding the protein of SEQ ID NO:258 
in the test sample. RNA levels may be measured using nucleic acid arrays or using techniques such 
as in situ hybridization, Northern blots, dot blots or other techniques familiar to those skilled in the 
art. If desired, an amplification reaction, such as a PCR reaction, may be performed on the nucleic 
acid sample prior to analysis. The level of RNA in the test sample is compared to RNA levels in 
control cells from liver or testis or tissues other than liver or testis to determine whether the test 
sample is from liver or testis. 
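
By way of illustration only, the comparison of a test sample with control samples described above may be sketched in Python as follows. The function name closest_control_group, the group labels and all numerical values are hypothetical and are not taken from the present application; levels are in arbitrary units. The sketch simply assigns the test sample to the control group whose mean level is nearest. An actual assay would be expected to require replicates, normalization and a validated decision threshold.

```python
from statistics import mean


def closest_control_group(test_level, controls):
    """Return the label of the control group whose mean level is nearest.

    controls maps a group label to a list of measured levels (protein or
    RNA, arbitrary units). This nearest-mean rule is an assumption made
    for the example, not a method stated in the text.
    """
    if not controls:
        raise ValueError("at least one control group is required")
    return min(controls, key=lambda label: abs(mean(controls[label]) - test_level))


# Hypothetical control measurements, invented for the example.
EXAMPLE_CONTROLS = {
    "liver or testis": [8.0, 9.5, 10.5],
    "other tissues": [0.5, 1.0, 1.5],
}

if __name__ == "__main__":
    print(closest_control_group(7.2, EXAMPLE_CONTROLS))
```

The same comparison could in principle be applied to antibody binding levels or to RNA levels, since the description treats both as measures of the level of the protein in the sample.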

Protein of SEQ ID NO:279 (internal designation 160-58-3-0-H3-CS) 



PCT/IB00/01938 



The protein of SEQ ID NO:279 is encoded by the cDNA of SEQ ID NO:38. Accordingly, 
it will be appreciated that all characteristics and uses of the polypeptide of SEQ ID NO:279 
described throughout the present application also pertain to the polypeptide encoded by a nucleic 
acid included in clone 160-58-3-0-H3-CS. In addition, it will be appreciated that all characteristics 
and uses of the nucleic acid of SEQ ID NO:38 described throughout the present application also 
pertain to the nucleic acid included in clone 160-58-3-0-H3-CS. 

The protein of SEQ ID NO:279 is encoded by a nucleic acid of 1330 nucleotides with an 
ORF from nt 198 to nt 998 yielding a 267 amino acid protein. The protein is a polymorphic variant 
of the sequence (SP:P01210) for proenkephalin A precursor (contains Met- and Leu-enkephalins). 
It has a signal peptide spanning 24 amino acids and 2 signature motifs for vertebrate endogenous 
opioid neuropeptides and endogenous opioid neuropeptide precursors. PSORT gives a predicted 
extracellular localization, including the cell wall (66.7%). The protein of SEQ ID NO:279 is 
primarily distributed in the fetal brain, although expression in other tissues has also been shown (see 
below). The polymorphic variation is found at amino acid position 75 (E->D, a conservative amino 
acid change). After signal peptide cleavage (amino acid 47 to 267; 220 amino acid), the protein still 
contains the polymorphic variation, which is now at amino acid position 29. This does not change 
any of the sequence of the different enkephalins that result after cleavage of this precursor protein. 
In addition, the polymorphism is 25 amino acids away from the first cleavage site on the amino 
terminal side. This is unlikely to change the secondary structure of the actual cleavage site. 
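
By way of illustration only, the coordinate arithmetic in the preceding paragraph can be reproduced with a short Python sketch. The numerical values are those stated in the text; the constant and function names are hypothetical. The text does not state whether the stop codon is included in the range nt 198 to 998, so the sketch only counts codons in that range.

```python
ORF_START_NT = 198    # first nucleotide of the ORF, as stated
ORF_END_NT = 998      # last nucleotide of the ORF, as stated
VARIANT_POS = 75      # polymorphic residue in the full-length protein
PROCESSED_START = 47  # first residue of the processed form, as stated
PROTEIN_END = 267     # last residue of the full-length protein


def orf_codons(start_nt, end_nt):
    """Number of complete codons in an inclusive nucleotide range."""
    length = end_nt - start_nt + 1
    if length % 3 != 0:
        raise ValueError("range is not a whole number of codons")
    return length // 3


def renumber(position, new_first_residue):
    """Position of a residue after the preceding residues are removed."""
    if position < new_first_residue:
        raise ValueError("residue is removed by the cleavage")
    return position - new_first_residue + 1


CODONS = orf_codons(ORF_START_NT, ORF_END_NT)
VARIANT_AFTER_CLEAVAGE = renumber(VARIANT_POS, PROCESSED_START)
PROCESSED_LENGTH = PROTEIN_END - PROCESSED_START + 1

if __name__ == "__main__":
    print(CODONS, VARIANT_AFTER_CLEAVAGE, PROCESSED_LENGTH)
```

The range nt 198 to 998 spans 801 nucleotides, or 267 codons, which matches the stated 267 amino acids if the range excludes the stop codon. Residue 75 becomes residue 29 when numbering starts at residue 47, as stated. Counted inclusively, residues 47 to 267 comprise 221 positions, one more than the 220 given in the text, so the text may be using a different counting convention.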

PCT publication WO9606863-A1, the disclosure of which is incorporated herein by 
reference in its entirety, discloses a protein having high homology with the protein of SEQ ID 
NO:279. Accordingly, the protein of SEQ ID NO:279 is believed to be an enkephalin. Met- 
and Leu-enkephalins compete with and mimic the effects of opiate drugs. These two pentapeptides 
with potent opiate agonist activity in bioassay systems were originally identified by Hughes et al 
(Nature, 258, 577-580, 1975). The natural ligands for opiate receptors, which differ only in their 
COOH terminal amino acid, were named Met- and Leu-enkephalin to reflect their origin from the 
brain. Peptides containing these sequences are termed opiate or opioid peptides. Enkephalins are 
widely distributed throughout the central nervous system in enkephalinergic neuronal networks, and 
also exist in the peripheral nervous system, for example in autonomic ganglia. Data, largely 
circumstantial, suggest wide-ranging involvement of endogenous opioids, for example in the 
modulation of pain perception, in mood and behaviour, learning and memory, responses to stress, 
diverse neuroendocrine functions, immune regulation and cardiovascular and respiratory function. 

Met-enkephalin enhances the immune reaction in patients with cancer or AIDS. It can bind 
opioid receptors present in peripheral inflamed tissues to mediate an analgesic effect. 

After exogenous administration of the different enkephalins, several immunologic functions 
are affected, including antibody production, NK cell activity against tumors and viral infections, 
macrophage and polymorphonuclear leukocyte functions, graft rejections, and mitogen-stimulated 
lymphocyte proliferation. The effects can be bi-directional, where low concentrations enhance, and 
high concentrations inhibit the same immune function. Thus, enkephalins are modulators of 
immune reactions. 

These opioid neuropeptides are released by post-translational proteolytic processing of 
precursor proteins. These multivalent precursor proteins (polyproteins) consist of a signal sequence 
followed by a conserved region of about 50 residues, a variable length region and the sequence of 
the various neuropeptides. The preproenkephalin A (gene PENK) is processed to produce the 
following peptides, which include Met-enkephalin (6 copies, 2 of which are extended) and 
Leu-enkephalin: 

Signal peptide   1-24 
Peptide          100-104   Met-enkephalin 1 
Peptide          107-111   Met-enkephalin 2 
Peptide          136-140   Met-enkephalin 3 
Peptide          186-193   Met-enkephalin-arg-gly-leu 
Peptide          210-214   Met-enkephalin 4 
Peptide          230-234   Leu-enkephalin 
Peptide          261-267   Met-enkephalin-arg-phe 
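
By way of illustration only, the coordinates listed above can be checked for internal consistency with a short Python sketch. The positions are those given in the list; the one-letter sequences (YGGFM for Met-enkephalin, YGGFL for Leu-enkephalin, and the corresponding extended forms) are the generally known enkephalin sequences and are not recited in this passage. The names PEPTIDES, span_length and inconsistent_entries are hypothetical.

```python
# (name, first residue, last residue, generally known sequence)
PEPTIDES = [
    ("Met-enkephalin 1", 100, 104, "YGGFM"),
    ("Met-enkephalin 2", 107, 111, "YGGFM"),
    ("Met-enkephalin 3", 136, 140, "YGGFM"),
    ("Met-enkephalin-Arg-Gly-Leu", 186, 193, "YGGFMRGL"),
    ("Met-enkephalin 4", 210, 214, "YGGFM"),
    ("Leu-enkephalin", 230, 234, "YGGFL"),
    ("Met-enkephalin-Arg-Phe", 261, 267, "YGGFMRF"),
]


def span_length(start, end):
    """Number of residues in an inclusive range of positions."""
    return end - start + 1


def inconsistent_entries(peptides):
    """Names of entries whose span length differs from the sequence length."""
    return [
        name
        for name, start, end, seq in peptides
        if span_length(start, end) != len(seq)
    ]


# Copies that contain the Met-enkephalin core, and the extended ones.
MET_COPIES = [p for p in PEPTIDES if p[3].startswith("YGGFM")]
EXTENDED = [p for p in MET_COPIES if len(p[3]) > 5]

if __name__ == "__main__":
    print(inconsistent_entries(PEPTIDES), len(MET_COPIES), len(EXTENDED))
```

Each listed span has the same length as the corresponding peptide, and the list contains six Met-enkephalin copies, two of which are extended, together with one Leu-enkephalin, in agreement with the preceding sentence.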

The conserved region in the N-termini of these precursors contains six cysteines that are 
probably involved in disulfide bonds. This region could also be important for the processing of the 
neuropeptides. 

The precursor protein does have the potential to be differentially cleaved into multiple 
extended enkephalin and non-enkephalin-containing peptides, the functions of which are largely 
unknown; however, in some cases it has been shown that extended enkephalin-containing peptides 
have enhanced opiate activity. Another peptide, enkelytin, is produced that exhibits anti-bacterial 
activity (see below). 

There is a growing body of evidence that proenkephalin exists largely independently of free 
enkephalin peptides in a number of tissues and cell types including astrocytes (Melner et al, EMBO 
J, 9, 791-796, 1990; Spruce et al, EMBO J 9, 1787-1795, 1990, the disclosures of which are 
incorporated herein by reference in their entireties), and is released from these cells in an 
unprocessed form (Batter et al, Brain Res. 563, 28-32, 1991, the disclosure of which is incorporated 
herein by reference in its entirety). There is evidence in some cases that processing enzymes are 
co-released along with the unprocessed precursor, which suggests that extracellular cleavage may occur 
(Vilijn et al, J. Neurochem. 53, 1487-1493, 1989). Even if biological activity is signalled through 
binding of the small peptide products to cell surface receptors, the regulation of this activity may be 
mediated through the precursor, and it is also possible that the unprocessed precursor has an 
additional intracellular role of its own. 

This protein was originally described to be present in various brain regions, most notably in 
the striatum, as well as in neuroendocrine tissues, the pituitary and adrenal gland. It is also 
expressed in a variety of immune cells, including ConA-stimulated CD4 T lymphocytes, CD4 
thymocytes, B lymphocytes, as well as T cell lines, macrophages and mast cells. Expression has 
been reported in the reproductive system, heart and many developing tissues during gestation and 
the early postnatal period. Because of this, it has been postulated that these peptides play a role in cell 
or tissue growth and differentiation. For example, endogenous enkephalins induced in thymocytes 
modulate their own expression and function to inhibit the proliferation of activated thymocytes. 

Enkephalin peptides are abundant in adrenal medulla and can be released by 
neurotransmitters specific for that tissue. Enkephalins have also been found to be abundant in 
human phaeochromocytoma, a tumour derived from the adrenal medulla. The RNA from this 
tumour contains a high level of enkephalin mRNA sequences as demonstrated by cell-free 
translation studies. 

Enkephalins function as ligands of opiate receptors, which are classified as delta, kappa and mu. A study by 
Lord et al (Nature, 267, 495-499, 1977) compared the activity of morphine and enkephalins in 
bioassay systems, and found that enkephalins bound predominantly to delta receptors. Subsequent 
studies have revealed homology of these receptors to other receptor families, including the 
immunoglobulin superfamily member OBCAM (Schofield et al, EMBO J 8, 489-495, 1989, the 
disclosure of which is incorporated herein by reference in its entirety) and somatostatin receptors 
(PCT publication WO96/06863, the disclosure of which is incorporated herein by reference in its 
entirety). This would explain the reported opioid binding properties of the former. Because of the 
latter's homology to opiate receptors, they would also be expected to bind opioid receptor ligands. 
The recognition of opioid peptides by other non-opiate related receptors implies that these peptides 
may exert other as yet unknown functions. 

Enkephalins are also involved in apoptosis. Apoptosis is the morphologically distinct 
process of controlled cell death which balances the process of cell production by mitosis. A 
molecular connection between control of cell production and cell elimination has now been 
established, including the roles of c-myc and p53 in the pathways mediating apoptotic cell death. It 
has been proposed that all mammalian cells may be programmed to die by default in the absence of 
continuous signalling from neighboring cells. However, the acquisition of a survival advantage 
which prevents a single cell from activating its suicide program in response to levels of genetic 
damage associated with common environmental insults could theoretically be an initiating event in 
oncogenesis since it would favor the persistence of potentially tumorigenic mutations. 
Alternatively, inappropriate activation of survival pathways might lead to overriding the intrinsic 
death program and promote tumorigenesis at early and late stages. A particularly potent oncogenic 
pathway would be one which both promoted and tolerated genetic damage and helped a cell 
overcome its need for extracellular survival signals. Approximately 50% of human tumors possess 
normal p53 function. Thus, additional pathways or molecules which inappropriately repress 
apoptosis in human tumours remain to be identified. Opioid-like molecules could be involved in 
such a pathway. 

There are published reports that pathways which include opioid-like molecules participate 
in regulating the equilibrium between cell death and survival. For example, morphine inhibits cell 
survival in the developing cerebellum (Hauser et al, Exp. Neurol, 130, 95-105, 1994, the disclosure 
of which is incorporated herein by reference in its entirety) and induces apoptosis in thymocytes 
(Fuchs and Pruett, J. Pharmacol. Exp. Ther. 266, 417-423, 1993, the disclosure of which is 
incorporated herein by reference in its entirety). 

In a series of experiments (PCT publication WO 96/06863), it has been found that 
proenkephalin and/or its proteolytic products act as extracellular and/or cell surface membrane 
bound factors which modulate cell survival in transformed cells a) upon deprivation of exogenous 
survival factors, and b) following genotoxic injury and/or stress when exogenous survival factors 
are non-limiting. The receptor(s) to which these factor(s) bind, which are most likely to exist on the 
cell surface, are related, or possibly identical, to one or more members of the opioid receptor family. 
Opioid-like receptor types or subtypes can mediate survival or death; receptor(s) 
which mediate death appear to be coupled to those which mediate survival. Natural ligands for these 
receptors are likely to be products of the opioid precursor genes, although natural ligands could 
include cytokines which mimic their effect. Tumour cells are more sensitive to antagonism of 
opioid-like receptor-mediated survival, and to stimulation of opioid-like receptor-mediated death, 
than non-transformed cells. The induction of cell cycle arrest enhances the sensitivity of tumour 
cells to these manipulations. (Enhanced sensitivity of tumour cells to these manipulations is induced 
by their synchronisation within the cell cycle.) 

Cytoplasmic proenkephalin and/or its proteolytic products act as general repressors of 
apoptosis. Agents which, if coupled to appropriate internalisation agents, would antagonise 
cytoplasmic proenkephalin would therefore be of use in the induction of apoptosis in 
non-transformed as well as transformed cells, particularly in combination with sublethal doses of 
known apoptosis-inducing agents. 

The repression of apoptosis mediated through cytoplasmic proenkephalin is activated at 
high cell density predominantly by nondiffusible factors. Inhibition of proenkephalin or its products 
as described above would therefore be potentiated if agents were used in combination, for example, 
with neutralising antibodies to integrins (such as the antibody 23C6; Bates et al., J. Cell Biol. 125 
403-415, 1994) to reduce exogenous survival signaling and simulate low density. 

Proenkephalin targeted to the cell nucleus induces apoptotic death, which is inhibited by the 
overexpression of large T antigen and is at least partly mediated through p53. Tumors which retain 
wild-type p53 function are therefore a particular target for apoptosis induction by agents which 
increase the levels of proenkephalin, or its derivatives, within the nucleus or which mimic the 
function of nuclear proenkephalin or its derivatives. 

Accordingly, the protein of SEQ ID NO:279, fragments thereof, or nucleic acids encoding 
the protein of SEQ ID NO:279 may be used to modulate a biochemical pathway in which products 
of opioid peptide precursor genes participate. In some embodiments, antibodies or other agents 
which reduce the level or activity of the protein of SEQ ID NO:279 or fragments thereof may be 
used to induce apoptosis in cells. The agents preferably neutralize the protein of SEQ ID NO:279 
or its proteolytic derivatives, increase the level of, activate or mimic nuclear proenkephalin, or act 
as an antagonist to receptors related or identical to the delta and kappa opioid receptors. In some 
embodiments, the agent may be a neutralizing monoclonal antibody against the protein of SEQ ID 
NO:279 or a fragment thereof. The agent may also be a fragment or allelic form of one of these 
antibodies. A cytoplasmic anchor or a nuclear localization signal may also be included in the 
agent. In some embodiments, the agent is able to modulate a biochemical pathway in a cell in 
which products of opioid peptide precursor genes participate in order to induce apoptosis. The 
agents can be used for the treatment of cancer or for inducing apoptosis in lens cells following a 
cataract operation. In some embodiments, the agents promote apoptosis of proliferating cells with 
less, or no, effect on normal mature cell types. The agents may be administered in combination 
with a genotoxic or cell cycle arrest agent. Alternatively, the agent may be complexed with a 
chemotherapeutic, irradiation or cell cycle arrest (synchronization) agent. 

Accordingly, the invention provides a means of inducing apoptosis in cells which comprises 
modifying a biological pathway of a cell in which a product of an opioid precursor gene participates 
in such a way that apoptosis is induced. Modification of the pathway is suitably effected by 
administration of an appropriate agent. In particular, the present invention provides an agent for use 
in inducing apoptosis in cells, said agent comprising an agent able to neutralise proenkephalin or its 
proteolytic derivatives; an agent which increases the level of nuclear proenkephalin and/or its 
derivatives, or which activates or mimics them; an agent which acts as an antagonist at receptor(s) 
related or identical to the delta opioid receptor; or an agent which acts as an agonist at receptor(s) 
related or identical to the kappa opioid receptor. 

A subset of such agents are agents able to neutralise proenkephalin or its proteolytic 
derivatives, or an agent which acts as an antagonist at receptor(s) related or identical to the delta 
opioid receptor, or an agent which acts as an agonist at receptor(s) related or identical to the kappa 
opioid receptor. 

In some embodiments, the agent may be administered to the cell surface whereupon the 
survival effects of extracellular and/or cell surface membrane bound proenkephalin or its proteolytic 
derivatives are neutralised, causing the cell to become apoptotic. Alternatively, an agent able to 
neutralise proenkephalin or its proteolytic derivatives may be coupled to an internalisation peptide 
and a cytoplasmic anchor. Such an assembly will remain in the cytoplasm of the cell, antagonising 
cytoplasmic proenkephalin and/or its proteolytic products and thus neutralising the apoptosis 
repressor effect of these molecules. 

Enkephalins also have anti-bacterial activity. During processing of proenkephalin A, 
the maturation in the adrenal medullary chromaffin cell starts with the removal of the 
carboxy-terminal end (proenkephalin-A-derived peptide or PEAP209-239) (Y. Goumon, K. Lugardon, B. 
Kieffer et al. J. Biol. Chem. 273:29847-29856, 1998, the disclosure of which is incorporated 
herein by reference in its entirety). The peptide enkelytin was identified as corresponding to 
bisphosphorylated PEAP209-237, and possesses antibacterial activity against Staphylococcus aureus 
and other gram-positive bacteria such as Micrococcus luteus and Bacillus megaterium (0.2-0.4 µM 
range). It does not affect the growth of gram-negative bacteria (E. coli strains D22, D31, 663 and 
T 13773), nor does it have any hemolytic activity. The activity of this peptide is specific: 
shorter versions of the peptide (209-220, 224-237, 230-237, 233-237) or non-phosphorylated 
PEAP209-239 exhibited little to no bacterial growth inhibiting activity. 

Bovine periarthritis abscess fluid contains different forms of PEAP (72-237/239; 
80-237/239) as identified by immunoreactivity and confirmed by sequence analysis. These peptides 
have activity against M. luteus, but are less active than enkelytin (5 versus 0.2 µM). These PEAPs 
constitute a pool of precursors which have to be processed, during infection, to provide active 
enkelytin. Presence of a PEAP at a molecular mass corresponding to that of PEAP209-237 was 
detected as well. PEAPs (PEAP202-238 and PEAP206-237) have also been detected in wound fluids, 
including bovine post-caesarean abscess in the subcutaneous lining, and an abscess induced by 
subcutaneous injection of complete Freund's adjuvant. Therefore, these peptides are present in 
wound fluids along with other known antibacterial peptides (defensins, bactenecins). The 
concentrations were in a range similar to that found to be active in vitro (0.5-1 µM). The PEAPs 
have also been detected in secretions from human polymorphonuclear neutrophils. 

The PEAP209-230 and enkelytin are secreted from cultured chromaffin cells following 
stimulation. This suggests that these two peptides are co-released with catecholamines in stress 
situations and may therefore play an important role in defense mechanisms. 

Co-release of met-enkephalin and enkelytin would represent a unified neuroimmune
protective response to stress situations that may be accompanied by infectious diseases. This
would provide a highly beneficial survival strategy at the very beginning of proinflammatory
processes. This protein would therefore play an important role in host defense against microbial
infections, especially those involving gram-positive bacteria. Due to their nonspecific activity on
membranes, the antibacterial peptides possess cytotoxic activities and may not only play a role in
antimicrobial defense, but also in inflammatory processes, possibly in wound repair.

The protein of SEQ ID NO:279, peptides derived by cleavage thereof, or fragments thereof
could be used as antibacterial agents in creams/ointments/solutions, presoaked bandages, or
dermal-type patches for external applications. Alternatively, the protein of SEQ ID NO:279,
peptides derived by cleavage thereof, or fragments thereof may be used in injections
(intravenously, subcutaneously or intra-peritoneally). This is useful for wound repair, burn
healing, and post-operative recovery management.

Alternatively, the protein of SEQ ID NO:279, peptides derived by cleavage thereof, or
fragments thereof, may be incorporated into disinfectant solutions used for cleaning surfaces such
as in the house (kitchen, bathroom) or in the office (desktops, phones, computer keyboards and
mice). Other applications are as additives in mouthwash or handi-popup wipes.

Altered levels of enkephalins may produce psychological disease. Konig et al. (Nature, 383,
535-538, 1996, the disclosure of which is incorporated herein by reference in its entirety) used a
genetic approach to study the role of the mammalian opioid system. They disrupted the
pre-proenkephalin gene using homologous recombination in embryonic stem cells to generate
enkephalin-deficient mice. Mutant enk -/- animals are healthy, fertile, and care for their offspring,
but display significant behavioral abnormalities. Mice with the enk -/- genotype are more anxious
and males display increased offensive aggressiveness. Mutant animals show marked differences
from controls in supraspinal, but not in spinal, responses to painful stimuli. These enk -/- mice do
however exhibit normal stress-induced analgesia. Therefore, enkephalins modulate responses to
painful stimuli. Thus, genetic factors may contribute significantly to the experience of pain. This
study clearly indicates the importance of enkephalins in pain perception, anxiety and
aggressiveness.

Interestingly, the PENK gene is localized on 8q23-q24, the same locus on which are found
genes related to epilepsy and spastic paraplegia, disorders related to brain dysfunction.

Accordingly, the protein of SEQ ID NO:279 or fragments thereof may be used for the
treatment of psychological disorders, especially those involving distortion in the perception of pain,
aggressiveness, or anxiety. This would include drug addiction, different types of phobias, panic
attacks, schizophrenia, bipolar disorder, anorexia nervosa, chronic pain disorders, post-traumatic
events, and post-operative pain management.

Accordingly, the present invention includes the use of the protein of SEQ ID NO:279,
fragments comprising at least 5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, or 200
consecutive amino acids thereof, or fragments having a desired biological activity to treat or
ameliorate a condition in an individual. For example, the condition may be cancer, a condition
resulting from increased or decreased cellular proliferation, bacterial infection, conditions resulting
from abnormal immune responses, psychological disease or any of the conditions listed above. In
such embodiments, the protein of SEQ ID NO:279, or a fragment thereof, is administered to an
individual in whom it is desired to increase or decrease any of the activities of the protein of SEQ
ID NO:279. The protein of SEQ ID NO:279 or fragment thereof may be administered directly to
the individual or, alternatively, a nucleic acid encoding the protein of SEQ ID NO:279 or a
fragment thereof may be administered to the individual. Alternatively, an agent which increases the
activity of the protein of SEQ ID NO:279 may be administered to the individual. Such agents may
be identified by contacting the protein of SEQ ID NO:279 or a cell or preparation containing the
protein of SEQ ID NO:279 with a test agent and assaying whether the test agent increases the
activity of the protein. For example, the test agent may be a chemical compound or a polypeptide
or peptide.
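
By way of illustration only, the fragments of a recited minimum length (at least 5, 8, 10, and so on, consecutive amino acids) may be enumerated with a simple sliding window. The following Python sketch assumes a one-letter amino acid string; the function name `consecutive_fragments` and the example sequence are hypothetical placeholders and do not represent the sequence of SEQ ID NO:279.

```python
from typing import Iterator, Tuple

# Minimum fragment lengths recited in the text.
FRAGMENT_LENGTHS = (5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, 200)


def consecutive_fragments(sequence: str, length: int) -> Iterator[Tuple[int, str]]:
    """Yield (1-based start position, fragment) for each run of `length` consecutive residues."""
    if length <= 0:
        raise ValueError("length must be positive")
    # A sequence shorter than `length` yields nothing because the range is empty.
    for start in range(len(sequence) - length + 1):
        yield start + 1, sequence[start:start + length]


# Hypothetical placeholder sequence; NOT the sequence of SEQ ID NO:279.
example_sequence = "MKTAYIAKQRQISFVKSHFSRQ"
fragments = list(consecutive_fragments(example_sequence, FRAGMENT_LENGTHS[0]))
```

A sequence of n residues yields n - k + 1 fragments of length k, so the longer recited lengths apply only to sequences at least that long.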

Alternatively, the activity of the protein of SEQ ID NO:279 may be decreased by
administering an agent which interferes with such activity to an individual. Agents which interfere
with the activity of the protein of SEQ ID NO:279 may be identified by contacting the protein of
SEQ ID NO:279 or a cell or preparation containing the protein of SEQ ID NO:279 with a test agent
and assaying whether the test agent decreases the activity of the protein. For example, the agent
may be a chemical compound, a polypeptide or peptide, an antibody, or a nucleic acid such as an
antisense nucleic acid or a triple helix-forming nucleic acid.

In one embodiment, the invention relates to methods and compositions using the protein of
the invention or part thereof as a marker protein to selectively identify the source of a sample as, for
example, fetal brain, or to distinguish between two or more possible sources of a sample on the
basis of the level of the protein of SEQ ID NO:279 in the sample. For example, the protein of SEQ
ID NO:279 or fragments thereof may be used to generate antibodies using any techniques known to
those skilled in the art, including those described herein. Such antibodies may then be used to
identify tissues of unknown origin, for example, forensic samples, differentiated tumor tissue that
has metastasized to foreign bodily sites, or to differentiate different tissue types in a tissue
cross-section using immunochemistry. In such methods a sample is contacted with the antibody, which
may be detectably labeled, under conditions which facilitate antibody binding. The level of
antibody binding to the test sample is measured and compared to the level of binding to control cells
from fetal brain or tissues other than fetal brain to determine whether the test sample is from fetal
brain. Alternatively, the level of the protein of SEQ ID NO:279 in a test sample may be measured
by determining the level of RNA encoding the protein of SEQ ID NO:279 in the test sample. RNA
levels may be measured using nucleic acid arrays or using techniques such as in situ hybridization,
Northern blots, dot blots or other techniques familiar to those skilled in the art. If desired, an
amplification reaction, such as a PCR reaction, may be performed on the nucleic acid sample prior
to analysis. The level of RNA in the test sample is compared to RNA levels in control cells from
fetal brain or tissues other than fetal brain to determine whether the test sample is from fetal brain.

In another embodiment, antibodies to the protein of the invention or part thereof may be
used for detection, enrichment, or purification of cells expressing the protein of SEQ ID NO:279,
including using methods known to those skilled in the art. For example, an antibody against the
protein of SEQ ID NO:279 or a fragment thereof may be fixed to a solid support, such as a
chromatography matrix. A preparation containing cells expressing the protein of SEQ ID NO:279 is
placed in contact with the antibody under conditions which facilitate binding to the antibody. The
support is washed and then the cells are released from the support by contacting the support with
agents which cause the cells to dissociate from the antibody.

In another embodiment of the present invention, the protein of SEQ ID NO:279 or a
fragment thereof may be used to diagnose disorders associated with altered expression of
the protein of SEQ ID NO:279. In such techniques, the level of the protein of SEQ ID NO:279 in
an ill individual is measured using techniques such as those described herein. The level of the
protein of SEQ ID NO:279 in the ill individual is compared to the level in normal individuals to
determine whether the individual has a level of the protein of SEQ ID NO:279 which is associated
with disease.

Protein of SEQ ID NO: 293 (internal designation 181-16-1-0-G7-CS)

The protein of SEQ ID NO: 293 has a high degree of homology with HSPC163 (Genbank 
accession number AF161512), the protein encoded by gene no: 93 (PCT/US99/17130) and the 
human cornichon protein TGAM77. SEQ ID NO: 293 is overexpressed in cancerous prostate, fetal 
brain and fetal kidney. 

The gene HSPC163 is one of three hundred cDNAs obtained from a CD34+ hematopoietic
stem / progenitor cell (HSPC) library (obtained from umbilical cord blood and adult bone marrow).
HSPC163 has also been identified in five hematopoietic cell lines: NB4 (granulocytic), HL60
(granulocytic), U937 (monocytic), K562 (erythro-megakaryocytic), and Jurkat (T lymphocytic).
These cell lines represent the distinct lineages of hematopoietic cells.

The polypeptide of gene no: 93 has been determined to have two transmembrane domains
and a short cytoplasmic tail. Based upon these characteristics, it is believed that the protein product
of gene no: 93 shares structural similarity to type IIIa membrane proteins. This gene is expressed
primarily in activated T-cells and to a lesser extent in endometrial tumor, T cell helper II cells,
microvascular endothelial cells, Raji cells treated with cycloheximide and umbilical vein
endothelial cells. The expression pattern of gene no: 93 indicates a role in regulating the
proliferation, survival, differentiation, and/or activation of hematopoietic cell lineages, including
blood stem cells. The gene product appears to be involved in the regulation of cytokine production,
antigen presentation, and other immune processes, suggesting a usefulness in boosting the immune
system. The translation product of this gene has high homology to the human TGAM77 and mouse
cornichon proteins.

TGAM77 was identified as a gene involved in the early phase of T-cell activation in response
to alloantigens. Twenty-four hours after T-cell allostimulation, RNA expression of TGAM77 is
significantly increased. TGAM77 has been designated as a T-cell growth associated molecule.
TGAM77 is a human homolog of the cornichon (cni) protein of the fruit fly Drosophila.

Cornichon was demonstrated to be involved in carefully orchestrated signaling events
during Drosophila oogenesis establishing an asymmetric pattern in the oocyte as a prerequisite for
correct embryogenesis. Cornichon signaling functions in concert with two other proteins. The
function of all three genes in an EGF-like signaling pathway appears to direct the formation of a
correctly polarized microtubule cytoskeleton, which is thought to be the basis for the correct spatial
localization of other signaling molecules essential for oocyte polarization, asymmetric movement
of the nucleus, and embryo differentiation.

The subject invention provides the amino acid sequence of SEQ ID NO: 293 and
polynucleotide sequences encoding the amino acid sequence of SEQ ID NO: 293. In one
embodiment, the polypeptides of SEQ ID NO: 293 are interchanged with the corresponding
polypeptides encoded by the human cDNA of clone 181-16-1-0-G7-CS. Also included in the
invention are biologically active fragments of SEQ ID NO: 293 and polynucleotide sequences
encoding these biologically active fragments. "Biologically active fragments" are defined as those
peptide or polypeptide fragments of SEQ ID NO: 293 which have at least one of the biological
functions of the full length protein (e.g., the ability to stimulate T-cell proliferation).

The invention also provides variants of SEQ ID NO: 293. These variants have at least
about 80%, more preferably at least about 90%, and most preferably at least about 95% amino acid
sequence identity to the amino acid sequence of SEQ ID NO: 293. Variants according to the
subject invention also have at least one functional or structural characteristic of SEQ ID NO: 293,
such as the biological functions described above. The invention also provides biologically active
fragments of the variant proteins. Unless otherwise indicated, the methods disclosed herein can be
practiced utilizing SEQ ID NO: 293 or variants thereof. Likewise, the methods of the subject
invention can be practiced using biologically active fragments of SEQ ID NO: 293, or variants of
said biologically active fragments.
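
As a non-limiting sketch, the identity thresholds recited above (about 80%, 90%, and 95%) may be checked on two sequences that have already been aligned. The text does not specify an alignment method or an identity convention; the Python example below assumes equal-length aligned strings with "-" as the gap character and counts identity over all alignment columns other than those where both sequences have a gap. The function name and the example strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumed convention: columns where both sequences have a gap are ignored;
    a column with a gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared


# Hypothetical aligned sequences, for illustration only.
reference = "ACDEFGHIKL"
candidate = "ACDEFGHIKV"
identity = percent_identity(reference, candidate)
is_variant_at_80 = identity >= 80.0
```

Other conventions (for example, dividing by the length of the shorter sequence) give different percentages, so the convention should be stated whenever a threshold is applied.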

Because of the redundancy of the genetic code, a variety of different DNA sequences can
encode SEQ ID NO: 293. It is well within the skill of a person trained in the art to create these
alternative DNA sequences which encode proteins having the same, or essentially the same, amino
acid sequence. These variant DNA sequences are, thus, within the scope of the subject invention.
As used herein, reference to "essentially the same" sequence refers to sequences that have amino
acid substitutions, deletions, additions, or insertions that do not materially affect biological activity.
Fragments retaining one or more characteristic biological activity of SEQ ID NO: 293 are also
included in this definition.
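
To illustrate the scale of this redundancy, the number of distinct DNA coding sequences for a given peptide is the product of the number of synonymous codons for each residue. The Python sketch below assumes the standard genetic code (61 sense codons) and ignores the stop codon; the function name and the example peptides are hypothetical and are not taken from SEQ ID NO: 293.

```python
from math import prod

# Number of synonymous codons per amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_coding_sequences(peptide: str) -> int:
    """Count the distinct DNA sequences that encode `peptide` (stop codon excluded)."""
    unknown = set(peptide) - set(CODONS_PER_RESIDUE)
    if unknown:
        raise ValueError(f"unsupported residue(s): {sorted(unknown)}")
    return prod(CODONS_PER_RESIDUE[residue] for residue in peptide)


# Hypothetical tripeptide: Leu, Ser and Arg each have six codons.
n_sequences = count_coding_sequences("LSR")
```

Because the count grows multiplicatively with length, even a short polypeptide is encoded by a very large number of alternative DNA sequences.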

"Recombinant nucleotide variants" are alternate polynucleotides which encode a particular 
protein. They can be synthesized, for example, by making use of the "redundancy" in the genetic 
code. Various codon substitutions, such as the silent changes which produce specific restriction 
sites or codon usage-specific mutations, can be introduced to optimize cloning into a plasmid or 
viral vector or expression in a particular prokaryotic or eukaryotic host system, respectively.

SEQ ID NO: 293 protein, and variants thereof, can be used to produce antibodies according
to methods well known in the art. The antibodies can be monoclonal or polyclonal. Antibodies can
also be synthesized against fragments of SEQ ID NO: 293 as well as variants of SEQ ID NO: 293
according to known methods. The subject invention also provides antibodies which specifically
bind to biologically active fragments of SEQ ID NO: 293 or biologically active fragments of SEQ
ID NO: 293 variants.

The subject invention also provides for immunoassays which are used to screen for,
monitor, or diagnose prostate cancer. Methods of screening for, diagnosing, identifying, or
monitoring the course of prostate cancer are well known to those skilled in the art. In this aspect of
the invention, immunoassays are provided which contact a biological sample (e.g., blood, serum,
tissue, or biopsied tissue sample) with antibodies which specifically bind to SEQ ID NO: 293,
immunogenic fragments of SEQ ID NO: 293, or biologically active fragments of SEQ ID NO: 293.
Immunocomplexes formed in the contacting step are then detected using an appropriately labeled
detection reagent. The levels of SEQ ID NO: 293 expressed in the tested biological samples are
compared to control/normal levels typically observed in the population.

Alternatively, methods which screen for, monitor, or diagnose prostate cancer may be
practiced with SEQ ID NO: 293, or fragments of SEQ ID NO: 293, as well as nucleic acids
encoding SEQ ID NO: 293, or fragments of SEQ ID NO: 293. In one embodiment, the
polypeptide may be used as a standard/control in the immunoassays described above. In another
embodiment, the nucleic acids encoding SEQ ID NO: 293, or fragments of SEQ ID NO: 293, are
used in hybridization assays, well known to the skilled artisan, to identify biological samples (e.g.,
blood, serum, tissue, or biopsied tissue sample) which contain SEQ ID NO: 293. The levels of
SEQ ID NO: 293 expressed in the tested biological samples are compared to control/normal levels
typically observed in the population.

In another embodiment, SEQ ID NO: 293, and polynucleotide sequences encoding the
amino acid sequence of SEQ ID NO: 293, can be used to identify or diagnose immune disorders
involving activated T-cells using standard hybridization assays.

Another aspect of the invention provides methods of immunostimulating a mammal. In this
aspect of the invention, SEQ ID NO: 293, and/or polynucleotide sequences encoding the amino
acid sequence of SEQ ID NO: 293, are introduced into T-cells according to well known methods.
The T-cells are then activated by stimulation with antigen to induce the immune system of the mammal.

In another embodiment, autologous T-cells are obtained from an individual. SEQ ID NO:
293, biologically active fragments thereof, and/or polynucleotide sequences encoding the amino
acid sequence, or biologically active fragments, of SEQ ID NO: 293, are introduced into these
autologous T-cells according to well known methods. The T-cells are expanded and reintroduced
into the individual from which the T-cells were obtained. See, for example, U.S. Patent Nos.
5,192,537 and 5,766,920, hereby incorporated by reference in their entirety.

In another embodiment of the subject invention, polynucleotides and polypeptides
encoding SEQ ID NO: 293 can be used to expand stem cells, committed progenitors of various
blood lineages, and in the differentiation and/or proliferation of various cell types. In this aspect of
the invention, polynucleotides and polypeptides encoding SEQ ID NO: 293 are introduced into the
cells and the cells cultured. These methods may be practiced according to methods well known to
the routineer.

Protein of SEQ ID NO:316 (internal designation 188-45-1-0-D9-CS)

The protein of SEQ ID NO:316 is encoded by the cDNA of SEQ ID NO:75. Accordingly,
it will be appreciated that all characteristics and uses of the polypeptide of SEQ ID NO:316
described throughout the present application also pertain to the polypeptide encoded by a nucleic
acid included in clone 188-45-1-0-D9-CS. In addition, it will be appreciated that all characteristics
and uses of the nucleic acid of SEQ ID NO:75 described throughout the present application also
pertain to the nucleic acid included in clone 188-45-1-0-D9-CS.

The protein of SEQ ID NO:316 is expressed in brain and contains three membrane-spanning
segments located between amino acid positions 6 and 26, 73 and 93, and 139 and 159, and a
signal peptide comprising the sequence FAAFCYMLSLVLC/AA. Accordingly, one embodiment
of the present invention is a polypeptide comprising one or more of the membrane-spanning
segments, and/or the signal peptide.
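
For illustration, the three membrane-spanning segments may be excised from a one-letter amino acid string using the positions recited above, read here as 1-based and inclusive (an assumption, since the text does not state the numbering convention). The Python sketch below uses a hypothetical placeholder string in place of SEQ ID NO:316, and the helper name `extract_segment` is likewise hypothetical.

```python
from typing import List, Tuple

# Positions recited in the text, interpreted as 1-based inclusive (start, end) pairs.
MEMBRANE_SPANNING_POSITIONS: List[Tuple[int, int]] = [(6, 26), (73, 93), (139, 159)]


def extract_segment(sequence: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of `sequence`."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("segment lies outside the sequence")
    return sequence[start - 1:end]


# Hypothetical 160-residue placeholder; NOT the sequence of SEQ ID NO:316.
placeholder_sequence = "ACDEFGHIKLMNPQRSTVWY" * 8

segments = [
    extract_segment(placeholder_sequence, start, end)
    for start, end in MEMBRANE_SPANNING_POSITIONS
]
```

Under this reading each segment is 21 residues long, which is consistent with a single membrane-spanning helix.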

The protein of SEQ ID NO:316 is a member of the cornichon protein family. It has 48%
identity with the Drosophila melanogaster cornichon protein as well as 67% identity with the
human cornichon homolog TGAM77 (Genbank accession No. AF104398, the disclosure of which
is incorporated herein by reference in its entirety), 67% identity with hCornichon, a bone marrow
secreted protein (PCT publication WO/9933979, the disclosure of which is incorporated herein by
reference in its entirety), 67% identity with a human secreted protein encoded by gene 24 (PCT
publication WO/9910363, the disclosure of which is incorporated herein by reference in its entirety)
and 67% identity with the protein product of the mouse cnih gene. However, this protein has higher
homology, 81% identity, to the mouse cornichon-like protein (Genbank accession No. AB006191,
the disclosure of which is incorporated herein by reference in its entirety), which is the product of
the mouse cnil gene. Finally, the protein of SEQ ID NO:316 has a high level of identity with
human secreted protein encoded by gene 95 (GSP:Y76218, PCT publication WO/9958660, the
disclosure of which is incorporated herein by reference in its entirety) and is likely a polymorphic
variant of gene 95. The high degree of sequence conservation between the members of this family
indicates that they are under strong selective pressure and are likely involved in important cellular
functions.

The Drosophila cornichon (cni) gene product is involved in signaling processes necessary
for both anterior-posterior and dorsal-ventral pattern formation during Drosophila embryogenesis
(Cell, 1995, 81:967-978). Mutations in cornichon prevent the formation of a correctly polarized
microtubule cytoskeleton in the oocyte. Cni signaling functions in concert with two other proteins.
Gurken, which is a protein secreted from the oocyte containing a single epidermal growth factor
(EGF) motif most similar in structure to vertebrate TGFα, is considered to be the ligand of the
Drosophila epidermal growth factor receptor (DER) homolog torpedo, which is expressed by the
follicular epithelium. The function of all three genes in an EGF-like signaling pathway appears to
direct the formation of a correctly polarized microtubule cytoskeleton, which is thought to be the
basis for the correct spatial localization of other signaling molecules essential for oocyte
polarization, asymmetric movement of the nucleus, and embryo differentiation. TGAM77, one of
the human homologs of cornichon, is differentially expressed in alloactivated T-cells (Bioch. Biophys.
Acta 1999, 1449:203-210, the disclosure of which is incorporated herein by reference in its
entirety). Since there is a well-known involvement of the microtubule cytoskeleton in spatial
polarization of signaling events in T-cell activation, it is thought that TGAM77 may function in a
protein-tyrosine kinase pathway required for the vectorial localization of signaling molecules in
T-cell activation.

The protein of SEQ ID NO:316 is found in brain tissue, and gene 95 (GSP:Y76218, PCT
publication WO/9958660, the disclosure of which is incorporated herein by reference) is expressed
in infant brain tissue, endometrial tumor tissue and frontal cortex tissue. ESTs matching this gene
are also found in lung tissue, germ cell tumors and skin melanomas. This is similar to the
expression pattern of the murine cnil gene, which is found in 6.5-day whole embryos, 11.5-day limb
bud, 13.5-day whole embryo, adult lung and brain (Dev. Genes Evol., 1999, 209:120-125, the
disclosure of which is incorporated herein by reference in its entirety).

Polynucleotides encoding the protein of SEQ ID NO:316 or fragments thereof and
polypeptides comprising the protein of SEQ ID NO:316 or fragments thereof are useful as reagents
for differential identification of the tissue(s) or cell type(s) present in a biological sample and for
diagnosis of diseases and conditions which include, but are not limited to, endometrial tumor, and
neural and developmental diseases and/or disorders. Similarly, the protein of SEQ ID NO:316 or
fragments thereof and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the neural and reproductive organs, expression
of this gene at significantly higher or lower levels may be routinely detected in certain tissues or cell
types (e.g., neural, reproductive, cancerous and wounded tissues) or bodily fluids (e.g., lymph,
serum, plasma, urine, amniotic fluid, synovial fluid and spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution in infant brain tissue and adult brain tissue, as well as the homology
to cornichon proteins, indicates that polynucleotides encoding the protein of SEQ ID NO:316 or
fragments thereof and polypeptides comprising the protein of SEQ ID NO:316 or fragments thereof
are useful for detecting and/or treating neural and developmental disorders. The tissue distribution
indicates that these polynucleotides and polypeptides are useful for the detection/treatment of
neurodegenerative disease states and behavioural disorders such as Alzheimer's Disease, Parkinson's
Disease, Huntington's Disease, Tourette Syndrome, schizophrenia, mania, dementia, paranoia,
obsessive compulsive disorder, panic disorder, learning disabilities, ALS, psychoses, autism, and
altered behaviors, including disorders in feeding, sleep patterns, balance, and perception. In
addition, the gene or gene product may also play a role in treatment and/or detection of
developmental disorders associated with the developing embryo, or sexually-linked disorders.

Elevated expression of the protein of SEQ ID NO:316 within the brain suggests that it may
be involved in neuronal survival, synapse formation, conductance, neural differentiation, etc. Such
involvement may impact many processes, such as learning and cognition. Alternatively, the tissue
distribution in endometrial tumor tissue, germ cell tumors and skin melanomas indicates that the
translation product of this gene is useful for the detection and/or treatment of endometrial tumors
and/or reproductive disorders, as well as tumors of other tissues where expression of this gene has
been observed. Furthermore, the protein of SEQ ID NO:316 or fragments thereof may also be used
to determine biological activity, to raise antibodies, as a tissue marker, to isolate cognate ligands or
receptors, to identify agents that modulate their interactions, in addition to its use as a nutritional
supplement. The protein of SEQ ID NO:316 or fragments thereof, as well as antibodies directed
against the protein, may be used as tumor markers and/or immunotherapy targets for the above listed
tissues.

The gene encoding the protein of SEQ ID NO:316 is thought to reside on chromosome 11.
Accordingly, polynucleotides encoding the protein of SEQ ID NO:316 or fragments thereof are
useful as a marker in linkage analysis for chromosome 11.

Accordingly, the present invention includes the use of the protein of SEQ ID NO:316,
fragments comprising at least 5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, or 200
consecutive amino acids thereof, or fragments having a desired biological activity to treat or
ameliorate a condition in an individual. For example, the condition may be an abnormality in
development, a signaling pathway, microtubule construction, neuronal survival, synapse formation,
conductance, neural differentiation, or it may be cancer or an abnormality in any of the functions
listed above. In such embodiments, the protein of SEQ ID NO:316, or a fragment thereof, is
administered to an individual in whom it is desired to increase or decrease any of the activities of
the protein of SEQ ID NO:316. The protein of SEQ ID NO:316 or fragment thereof may be
administered directly to the individual or, alternatively, a nucleic acid encoding the protein of SEQ
ID NO:316 or a fragment thereof may be administered to the individual. Alternatively, an agent
which increases the activity of the protein of SEQ ID NO:316 may be administered to the
individual. Such agents may be identified by contacting the protein of SEQ ID NO:316 or a cell or
preparation containing the protein of SEQ ID NO:316 with a test agent and assaying whether the
test agent increases the activity of the protein. For example, the test agent may be a chemical
compound or a polypeptide or peptide.

Alternatively, the activity of the protein of SEQ ID NO:316 may be decreased by
administering an agent which interferes with such activity to an individual. Agents which interfere
with the activity of the protein of SEQ ID NO:316 may be identified by contacting the protein of
SEQ ID NO:316 or a cell or preparation containing the protein of SEQ ID NO:316 with a test agent
and assaying whether the test agent decreases the activity of the protein. For example, the agent
may be a chemical compound, a polypeptide or peptide, an antibody, or a nucleic acid such as an
antisense nucleic acid or a triple helix-forming nucleic acid.

In one embodiment, the invention relates to methods and compositions using the protein of
the invention or part thereof as a marker protein to selectively identify the source of a sample as, for
example, brain, or to distinguish between two or more possible sources of a sample on the basis of
the level of the protein of SEQ ID NO:316 in the sample. For example, the protein of SEQ ID
NO:316 or fragments thereof may be used to generate antibodies using any techniques known to
those skilled in the art, including those described herein. Such antibodies may then be used to
identify tissues of unknown origin, for example, forensic samples, differentiated tumor tissue that
has metastasized to foreign bodily sites, or to differentiate different tissue types in a tissue
cross-section using immunochemistry. In such methods a sample is contacted with the antibody, which
may be detectably labeled, under conditions which facilitate antibody binding. The level of
antibody binding to the test sample is measured and compared to the level of binding to control cells
from brain or tissues other than brain to determine whether the test sample is from brain.
Alternatively, the level of the protein of SEQ ID NO:316 in a test sample may be measured by
determining the level of RNA encoding the protein of SEQ ID NO:316 in the test sample. RNA
levels may be measured using nucleic acid arrays or using techniques such as in situ hybridization,
Northern blots, dot blots or other techniques familiar to those skilled in the art. If desired, an
amplification reaction, such as a PCR reaction, may be performed on the nucleic acid sample prior
to analysis. The level of RNA in the test sample is compared to RNA levels in control cells from
brain or tissues other than brain to determine whether the test sample is from brain.

In another embodiment, antibodies to the protein of the invention or part thereof may be
used for detection, enrichment, or purification of cells expressing the protein of SEQ ID NO:316,
including using methods known to those skilled in the art. For example, an antibody against the
protein of SEQ ID NO:316 or a fragment thereof may be fixed to a solid support, such as a
chromatography matrix. A preparation containing cells expressing the protein of SEQ ID NO:316 is
placed in contact with the antibody under conditions which facilitate binding to the antibody. The
support is washed and then the cells are released from the support by contacting the support with
agents which cause the cells to dissociate from the antibody.

In another embodiment of the present invention, the protein of SEQ ID NO:316 or a
fragment thereof may be used to diagnose disorders associated with altered expression of the
protein of SEQ ID NO:316. In such techniques, the level of the protein of SEQ ID NO:316 in an ill
individual is measured using techniques such as those described herein. The level of the protein of
SEQ ID NO:316 in the ill individual is compared to the level in normal individuals to determine
whether the individual has a level of the protein of SEQ ID NO:316 which is associated with
disease.

Protein of SEQ ID NO:255 (106-037-1-0-E9-CS.cor)

The protein of SEQ ID NO:255, encoded by the cDNA of SEQ ID NO:14, is strongly
expressed in the liver and testis and shows extensive homology to human lactate dehydrogenase-A
protein (LDH-A or M chain) (Chung F.Z. et al., Biochem. J. 231:537-541 (1985); SwissProt
accession number P00338). The protein of SEQ ID NO:255 is also homologous to lactate
dehydrogenase A from many vertebrates. The 381-amino-acid-long protein of SEQ ID NO:255
displays a Prosite motif corresponding to lactate dehydrogenase from positions 71 to 380. In
addition, the active site LGEHGDS, where H is the active site residue, is present in the protein of
the invention (positions 239 to 245). The protein of the invention also contains an additional 50
N-terminal amino acids not found in other lactate dehydrogenase A proteins. This N-terminal
extension contains a signal peptide (cleavage site at position 34 of the protein of the invention) that
may allow the export of the protein to the extracellular domain or define a particular subcellular
localization. Alternatively, the start codon could be at position 26 or 50 of the protein of
SEQ ID NO:255.
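
As an illustration of how the position of an active-site motif such as LGEHGDS can be checked against a protein sequence, the following minimal Python sketch reports 1-based, inclusive positions of every occurrence of a motif. The function name `find_motif` and the toy sequence are assumptions made for illustration only; the toy sequence is not SEQ ID NO:255.

```python
def find_motif(sequence: str, motif: str) -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) positions of each motif occurrence."""
    hits = []
    start = sequence.find(motif)
    while start != -1:
        # Convert the 0-based index to the 1-based numbering used for residues.
        hits.append((start + 1, start + len(motif)))
        # Resume one residue later so overlapping occurrences are also found.
        start = sequence.find(motif, start + 1)
    return hits


# Hypothetical toy sequence for illustration, NOT the sequence of SEQ ID NO:255.
toy_sequence = "MKAAALGEHGDSWTR"
print(find_motif(toy_sequence, "LGEHGDS"))
```

Applied to the actual sequence, a hit at (239, 245) would correspond to the positions stated above.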

Lactate dehydrogenase (LDH) is an enzyme which dehydrogenates lactic acid into
pyruvic acid in conjunction with the hydrogen acceptor NAD+, and which exists in a wide
variety of animal tissues and microorganisms as an enzyme serving to produce lactic acid
from pyruvic acid in the glycolytic pathway (Abad-Zapatero C. et al., J. Mol. Biol.
198:445-467 (1987)). It is known that in vertebrates there are three isozymes of LDH: the M form
(LDH-A), found predominantly in muscle tissues; the H form (LDH-B), found in heart
muscle; and the X form (LDH-C), found only in the spermatozoa of mammals and birds.
In the eye lenses of birds and crocodilians, LDH-B serves as a structural protein and is known as
epsilon-crystallin (Hendriks W. et al., Proc. Natl. Acad. Sci. U.S.A. 85:7114-7118 (1988)).

LDH has been used extensively in the field of clinical test reagents for a number of
purposes. For example, it has been used as a coupling enzyme to determine the enzymatic
activity of various aminotransferases, such as alanine aminotransferase (ALT), which is
ultimately detected by UV spectrometry of the produced pyruvic acid. This use of LDH
has been widely adopted as a clinical test, because aminotransferases are enzymes which
show high activity in liver, heart, kidney, etc. and show remarkable increases in serum in
association with various diseases. LDH has also been used as a coupling enzyme to help
determine the level of substrates such as urea, as the enzyme promotes the conversion of
such substances into pyruvic acid which can be detected by UV spectrometry.

Lactate dehydrogenase is also a widely used marker for heart disease and other
conditions. For example, levels of LD-1 are elevated in the presence of myocardial
infarction and in other conditions such as leukemia. Levels of lactate dehydrogenase start
to increase 24 to 48 hours after occlusion of the coronary artery, peak in 3 to 6 days, and
return to normal in 8 to 14 days. In addition, levels of LD-1 are elevated 10 to 12 hours
after the acute myocardial infarction, peak in 2 to 3 days, and return to normal in
approximately 7 to 10 days. Thus, measurement of the level of lactate dehydrogenase
allows a prolonged retrospective diagnosis of myocardial infarction. Further, while the
amount of LD-2 in the blood is usually higher than the amount of LD-1, patients with acute
myocardial infarction have more LD-1 than LD-2. This "flipped ratio" usually returns to
normal in 7 to 10 days. An elevated level of LD-1 with a flipped ratio has a sensitivity and
specificity of approximately 75% to 90% for detection of acute myocardial infarction.
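
The "flipped ratio" criterion described above can be expressed as a simple comparison. The sketch below is illustrative only: the function name `has_flipped_ratio` and the example values are assumptions, and the result is not a diagnostic threshold validated by the text.

```python
def has_flipped_ratio(ld1: float, ld2: float) -> bool:
    """Return True when LD-1 exceeds LD-2, i.e. the LD-1/LD-2 ratio is above 1.

    Both values must be in the same units (for example U/L or percent of
    total LDH activity); the units themselves cancel out in the ratio.
    """
    if ld1 < 0 or ld2 <= 0:
        raise ValueError("LD-1 must be non-negative and LD-2 must be positive")
    return ld1 / ld2 > 1.0


# Hypothetical measurements for illustration only.
print(has_flipped_ratio(ld1=25.0, ld2=35.0))  # usual pattern: LD-2 higher
print(has_flipped_ratio(ld1=40.0, ld2=30.0))  # flipped pattern: LD-1 higher
```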

Elevated LDH levels have also been used as a prognostic indicator for cancers such
as small cell lung carcinoma. Specifically, elevated levels of LDH indicate a poor
prognosis for such diseases (Kawahara, et al., (1997) Jpn J Clin Oncol. 1997 Jun;27(3):158-65).

LDH expression in cells has also been shown to be induced by interleukin-1 alpha, a
major cytokine associated with, e.g., inflammation (Nehar et al. (1998) Biol Reprod
Dec;59(6):1425-32).

Islet beta-cells express low levels of lactate dehydrogenase and have high glycerol
phosphate dehydrogenase activity. The effects on glucose metabolism and insulin secretion
of acute overexpression of the skeletal muscle isoform of lactate dehydrogenase (LDH)-A
in these cells have been studied by Ainscow EK et al. (Diabetes 2000 Jul;49(7):1149). The
results of these studies have shown that overexpression of LDH activity interferes with
normal glucose metabolism and insulin secretion in islet beta cells, and it may therefore be
directly responsible for insulin secretory defects in some forms of type 2 diabetes. These
results also reinforce the view that glucose-derived pyruvate metabolism in the
mitochondria is critical for glucose-stimulated insulin secretion in beta cells. Other data
show that an overexpression of lactate dehydrogenase A attenuates glucose-induced insulin
secretion in stable MIN-6 beta-cell lines, which normally express low levels of L-lactate
dehydrogenase (Zhao C, Rutter GA, FEBS Lett. 1998 Jul 3;430(3):213-6). Low LDH
activity thus appears to be important in beta-cell glucose sensing.

Analysis of the LDH isoenzyme pattern in CSF has also been shown to be helpful in
the evaluation of CNS involvement in patients with hematologic malignancies (Lossos IS, et al.
Cancer. 2000 Apr 1;88(7):1599-604).

It is believed that the protein of SEQ ID NO:255 is a lactate dehydrogenase protein, most 
likely of the LDH-A or M subtype. The activity of the present protein can be assessed using any 
standard method for detecting lactate dehydrogenase enzyme activity, including those involving the 
UV detection of pyruvate, a product of LDH-catalyzed enzymatic reactions. 
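
As a hedged illustration of how a UV-based LDH activity measurement can be converted into enzyme units, the sketch below applies the Beer-Lambert law. It assumes, as a stand-in for the detection described above, that the reaction is followed through the absorbance change of NADH at 340 nm (millimolar extinction coefficient 6.22 per mM per cm), which is a common way of monitoring LDH reactions rather than a method stated in this text. The function name, parameters and example values are illustrative assumptions.

```python
NADH_EXT_COEFF_MM = 6.22  # mM^-1 cm^-1 at 340 nm (assumed detection wavelength)


def ldh_activity_u_per_ml(
    delta_a340_per_min: float,
    total_volume_ml: float,
    sample_volume_ml: float,
    path_length_cm: float = 1.0,
    dilution_factor: float = 1.0,
) -> float:
    """Convert an absorbance slope into activity in U/mL of sample.

    One unit (U) is taken as 1 micromole of NADH converted per minute.
    """
    if min(total_volume_ml, sample_volume_ml, path_length_cm) <= 0:
        raise ValueError("volumes and path length must be positive")
    # Beer-Lambert: concentration change (mM/min) = dA/min / (epsilon * path).
    rate_mm_per_min = abs(delta_a340_per_min) / (NADH_EXT_COEFF_MM * path_length_cm)
    # 1 mM equals 1 micromole per mL, so multiplying by the cuvette volume
    # gives micromoles per minute; dividing by sample volume gives U/mL.
    return rate_mm_per_min * total_volume_ml * dilution_factor / sample_volume_ml


# Hypothetical reading: A340 falls by 0.0622 per minute in a 1 mL cuvette
# containing 0.1 mL of sample.
print(ldh_activity_u_per_ml(-0.0622, total_volume_ml=1.0, sample_volume_ml=0.1))
```

The absolute value of the slope is used so that the same function serves both reaction directions (NADH consumed or NADH produced).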

In one embodiment, the polypeptides and polynucleotides of the invention are used to detect
testis and liver tissue, as well as cells derived from these tissues. For example, nucleic acids and
proteins of the invention can be labeled isotopically or chemically, using methods known to those
skilled in the art, and used as probes in northern blots, far-western blots and in situ hybridization
experiments. An ability to detect specific cell types is useful, e.g. for the determination of the
history of tumor cells, as well as for the identification of cells and tissues for histological studies.

In another embodiment, the present protein can be used in any of a variety of clinical assays
involving LDH enzymes. For example, the protein can be used as a coupling enzyme to determine
the enzymatic activity of various aminotransferases, such as alanine aminotransferase (ALT), as
detected by UV spectrometry of the produced pyruvic acid. Such assays have significant clinical
utility, as aminotransferases are enzymes which show high activity in liver, heart, kidney, etc. and
show remarkable increases in serum in association with various diseases. The protein of the
invention can also be used as a coupling enzyme to help determine the level of substrates such as
urea, as the enzyme promotes the conversion of such substances into pyruvic acid which can be
detected by UV spectrometry.

In another embodiment, the present protein can be used to identify ingredients for cosmetic
formulations. Specifically, enhancers of lactate dehydrogenase can be included in cosmetic
compositions to stimulate keratinocyte proliferation and collagen synthesis in cutaneous tissues.
The inhibitors can be combined with other active ingredients such as pyruvic acid, acetic acid,
acetoacetic acid, beta-hydroxybutyric acid, Krebs cycle pathway metabolites, aliphatic saturated or
unsaturated fatty acids containing from 8 to 26 carbon atoms, omega-hydroxy acids containing from
22 to 34 carbon atoms, glutamic acid, glutamine, valine, alanine, leucine, and mixtures thereof (see,
e.g., US Patent 5,853,742, the disclosure of which is hereby incorporated by reference in its
entirety).

In another embodiment, the present invention provides methods for treating or preventing
cancer, e.g., by inhibiting lactate dehydrogenase activity in cells, preferably specifically the cancer
cells, of a patient. The expression or activity of lactate dehydrogenase can be inhibited using any of
a large number of agents, including, but not limited to, antibodies, antisense molecules, ribozymes,
and heterologous molecules that inhibit the expression or activity of the lactate dehydrogenase in
the cancer cells of the patient. In one embodiment, lactate dehydrogenase that has been obtained
from a primate, or anti-lactate dehydrogenase antibodies obtained from a mammal as a result of the
parenteral administration of primate lactate dehydrogenase to said mammal, is parenterally
administered to human cancer patients. Antibodies derived from the protein of the invention or part
thereof can also be used to inhibit cancer cell development as described in US Patent No. 4,620,972.

Analysis of the LDH isoenzyme pattern in CSF has been shown to be helpful in the
evaluation of CNS involvement in patients with hematologic malignancies (Lossos IS, et al. Cancer.
2000 Apr 1;88(7):1599-604). Thus, in another embodiment, the protein of SEQ ID NO:255 can be
used to develop assays to monitor the LDH isoenzyme activity in CSF, thereby improving the
sensitivity of CSF cytology. This assay may be derived, e.g., from the methods described by Short
S. et al. (J Biol Chem. 2000 Apr 28;275(17):12963-9).

In another embodiment, the protein of SEQ ID NO:255 is used to detect and/or treat insulin
secretory defects in some forms of type 2 diabetes. For example, various evidence indicates that
LDH overexpression may be involved in certain types of diabetes. Therefore, the detection of an
elevated level of LDH in a patient, e.g. in pancreatic islet cells of a patient, can be used as an
indication that the patient has diabetes, or is at risk of developing diabetes. Similarly, methods of
inhibiting the expression or activity of LDH in those cells, e.g. using antibodies, antisense
sequences, or heterologous compounds that inhibit the expression or activity of LDH, can be used to
treat or prevent diabetes.

In another embodiment, the protein of the invention can be used to eliminate endogenous 
pyruvic acid in cells in vitro or in vivo. 

In another embodiment, the expression of the present protein is used as a marker for
interleukin 1, e.g. IL-1 alpha, activity in cells or in a patient. Specifically, as it has been shown that
LDH expression is induced by IL-1 alpha, the expression, or elevated expression, of the
present protein can be used as a marker for the action of IL-1 on the cell. As IL-1 has been
implicated in a number of physiological processes, including inflammation and more specifically in
deleterious processes such as arthritis and autoimmune disorders, the present protein can serve as a
marker for the presence of such disorders, or for a predisposition for the disorders.

In another embodiment, the present protein is used to detect heart disease and other
diseases in patients. For example, levels of LDH are known to rise following myocardial
infarction and other heart ailments. Accordingly, the detection of an elevated level of the
protein of the invention, alone or in view of the levels of other proteins such as other LDH
isozymes, can be used as an indicator of a heart attack or other diseases, including
leukemia. The levels of LDH can be assessed in any tissue or biological sample, including,
but not limited to, serum, and can be detected using any standard method, including, but
not limited to, immunoassays and assays for LDH enzyme activity.

In another embodiment, the present protein is used to determine a prognosis for any
of a number of diseases, including cancers such as small cell lung carcinoma. For example,
the level of the present protein is detected in the serum of a patient suffering from cancer,
wherein the detection of a decreased level of expression or activity of the protein indicates a
worse prognosis for the patient compared to the prognosis in a patient with a normal level
of the protein activity or expression.

Proteins of SEQ ID NOs:243, 253 (internal designation numbers 105-016-1-0-D3-CS and
105-095-2-0-G11-CS)

The 331-amino-acid-long protein of SEQ ID NO:243, encoded by the cDNA of SEQ ID
NO:2, is found in prostate and in fetal brain and is homologous to a secreted human protein (Genseq
accession number Y59685). In addition, this protein is highly homologous to the putative
glycerophosphodiester phosphodiesterase (GP-PDE) MIR16 (Membrane Interacting protein of
RGS16) protein (SPTREMBLNEW SPTREMBL SWISSPROT accession number AAF65234)
encoded by the cDNA of GENPEPT GENPEPTNEW accession number AF212862; in fact, the
protein of the invention is a likely variant of the MIR16 protein. Furthermore, a BLAST search
with the amino acid sequence of SEQ ID NO:243 indicates that the protein of the invention is
homologous to GP-PDEs of E. coli (SWISSPROT accession numbers P09394 and P10908) and
Haemophilus influenzae (SWISSPROT accession number Q06282). The protein of SEQ ID
NO:243 displays 2 candidate membrane-spanning segments, from amino acids 7 to 27 and 258 to
278, and a putative signal peptide from amino acids 19 to 24. Finally, the protein of the invention
has two putative N-glycosylation sites: asparagine residues at positions 168 and 198 (Zheng et al.,
Proc. Natl. Acad. Sci. 97:3999-4004 (2000)).
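
Putative N-glycosylation sites of the kind mentioned above are commonly located by scanning for the consensus sequon N-X-S/T, where X is any residue except proline. The sketch below is an illustration under that assumption; the text does not state how the sites at positions 168 and 198 were predicted. The function name and the toy sequence are hypothetical, and the toy sequence is not SEQ ID NO:243.

```python
import re

# Lookahead keeps matches zero-width after N, so overlapping sequons are found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_n_glycosylation_sites(sequence: str) -> list[int]:
    """Return 1-based positions of asparagines in an N-X-S/T sequon (X != P)."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence)]


# Hypothetical toy sequence for illustration, NOT the sequence of SEQ ID NO:243.
toy_sequence = "MNGSANPTNAT"
print(find_n_glycosylation_sites(toy_sequence))
```

In the toy sequence, the asparagine followed by proline is skipped, which reflects the usual exclusion of proline at the X position.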

The cDNA of SEQ ID NO:2 differs from the cDNA of GENPEPT GENPEPTNEW
accession number AF212862 by its extended 5' and 3' termini, and from the cDNA of SEQ ID
NO:12 by polymorphisms and alternate splicings.

The MIR16 (Membrane Interacting protein of RGS16) protein, which is homologous to the
protein of the invention, was identified in a yeast two-hybrid screen of a pituitary cell cDNA library
using the RGS16 (Regulator of G protein Signaling) protein as bait (Zheng et al., Proc. Natl. Acad.
Sci. 97:3999-4004 (2000)). and Sasaki, J. Bacteriol. 175:4569-4571 (1993); Zheng et al., ibid.).
Remarkably, the GP-PDE from Haemophilus influenzae (also called protein D), which is 67%
identical to the periplasmic GP-PDE of E. coli, presents affinity for human immunoglobulin D (Janson
et al., Infect. Immun. 62:4848-854 (1994)).

From sequence alignments, it can be seen that the N-terminal region of MIR16 (amino
acids 70-150), immediately after the putative signal peptide, is highly conserved (40-61%
similarity), suggesting that it may contain residues critical for catalytic activity, i.e., the catalytic
site. GP-PDEs hydrolyze deacylated phospholipid GPs, such as glycerophosphocholine (GPC)
and glycerophosphoethanolamine, to sn-glycerol-3-phosphate (G3P) and the corresponding alcohols
(Zheng et al., ibid.). The putative enzymatic activity of MIR16 and its interaction with RGS16
suggest that it may play important roles in lipid metabolism and in G protein signaling. As shown
in northern blot experiments, the MIR16 mRNA is highly transcribed in heart, liver, kidney, testis
and brain. The observed expression of MIR16 in the brain is consistent with the above-described
expression of the protein of the invention in the fetal brain.

It is believed that the proteins of SEQ ID NOs:243 and 253 or part thereof are members of
the glycerophosphodiester phosphodiesterase protein family, interact with the RGS16 protein and,
as such, play important roles in both lipid metabolism and in G protein signaling. Preferred
polypeptides of the invention are polypeptides comprising the amino acids of SEQ ID NO:243 from
positions 7 to 27, 19 to 24 and 258 to 278. Other preferred polypeptides of the invention are
fragments of SEQ ID NO:243 or 253 having any of the biological activities described herein.
Additional preferred polypeptides are those that comprise asparagine residues at positions 168
and/or 198.
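
Because the preferred fragments above are defined by 1-based, inclusive residue positions, extracting them from a sequence held in a 0-based language requires an explicit conversion. The following sketch shows one way to do this. The function name `extract_fragment` is an assumption, and the 331-residue placeholder string merely stands in for SEQ ID NO:243, whose actual sequence is not reproduced here.

```python
def extract_fragment(sequence: str, start: int, end: int) -> str:
    """Return residues start..end using 1-based inclusive coordinates."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("positions must satisfy 1 <= start <= end <= length")
    # Python slices are 0-based and end-exclusive, hence start - 1 and end.
    return sequence[start - 1:end]


# Placeholder of the stated length (331 residues); NOT the real sequence.
alphabet = "ACDEFGHIKLMNPQRSTVWY"
placeholder = (alphabet * 17)[:331]

# Position ranges taken from the text above.
for start, end in [(7, 27), (19, 24), (258, 278)]:
    fragment = extract_fragment(placeholder, start, end)
    print(f"{start}-{end}: {len(fragment)} residues")
```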

The invention first relates to methods and compositions using cDNAs of SEQ ID NO:2 or
12 or part thereof, and proteins of SEQ ID NO:243 or 253 or part thereof, to identify
specific cell types, preferably from prostate or fetal brain. For example, nucleic acids and proteins
of the invention are labeled isotopically or chemically following methods known to those skilled in
the art, and further used as probes in northern blots, far-western blots and in situ hybridization
detection experiments. An ability to detect specific cell types is useful, e.g. for the determination of
the history of tumor cells, as well as for the identification of cells and tissues for histological
studies.

Any of a number of in vitro assays can be used to detect SEQ ID NO:243 or 253 protein
activity, for example for in vitro screening of modulators of protein activity. Preferably, cDNA
encoding the protein of the invention is cloned in a prokaryotic expression vector, according to
methods known to those skilled in the art. Briefly, the GP-PDE activity of the recombinant protein
is analyzed by a coupled spectrophotometric assay as described by Larson and collaborators and
adapted by Cameron and collaborators (Larson et al., J. Biol. Chem. 258:5426-5432 (1983);
Cameron et al., Infect. Immun. 66:5763-5770 (1998)). Such enzymatic activity may be measured
in vitro in the presence of modulating drugs.

Another embodiment of the present invention relates to methods of using the protein of the
invention or part thereof to purify or specifically bind to human immunoglobulin D. Several
immunoglobulin (Ig) binding bacterial cell wall proteins have been isolated and/or cloned during the
last two decades. The best characterized of these are protein A of Staphylococcus aureus (which
binds to human IgG subclasses 1, 2 and 4, IgG of several mammalian species, and in some
instances human Ig of classes A, M, E), and protein G of group G beta-hemolytic streptococci
(which binds to all human IgG subclasses and which also displays a wider binding spectrum for
animal IgG than protein A). IgD binds to neither protein A nor protein G. Consequently, it is of
great interest to identify new proteins capable of binding IgD, thereby allowing its separation and
purification. In addition, IgD binding proteins can also be used in immunoprecipitation procedures
with IgD, as are routinely performed with proteins A and G in the case of IgG. The binding and
purification of IgD using the protein of the invention can be accomplished in any of a number of
ways, for example by generating a fusion protein or polypeptide in which the protein of the
invention or part thereof is combined with another protein by the use of a recombinant DNA
molecule. The resulting fusion product including the protein of the invention or part thereof is then
covalently, or by any other means, bound to a protein, carbohydrate or matrix (such as gold,
"Sephadex" particles, polymeric surfaces). Such a complex is very useful for IgD immobilization
and consecutive immunoprecipitations in batch. Similar assays for binding of protein D (GP-PDE)
of Haemophilus influenzae and IgD are described in US Patent No. 6,025,484.

Another embodiment of the invention relates to compositions and methods using the protein
of the invention, or part thereof, as GP-PDE enzymes to hydrolyze deacylated phospholipids (GPs),
such as glycerophosphocholine (GPC) and glycerophosphoethanolamine, to
sn-glycerol-3-phosphate (G3P) and the corresponding alcohols. First, this enzymatic activity, which
belongs to the class of specific phospholipase D, makes the protein of the invention very useful to
study biological membranes and their phospholipidic components. Moreover, as glycerophospholipids
are major components of the lipidic bilayer, elimination of their hydrophilic moiety using the
GP-PDE activity of the protein of the invention would likely modify the structure and consequently
the permeability of eukaryotic cell membranes. Such modifications could improve the transfection
efficiency of eukaryotic cells, in vitro or in vivo. Typically, in such embodiments the purified
protein of SEQ ID NOs:243 or 253 is administered to cells; purified proteins of the invention can
be obtained in any of a number of ways, for example by inserting the cDNA encoding the proteins
into a prokaryotic expression vector using any technique known to those skilled in the art. The
recombinant protein produced and purified in the prokaryotic system is then added to an in vitro
culture of eukaryotic cells before or during transfection. The recombinant protein of the invention
can also be used to increase the efficiency of cell transfection in vivo, most notably in the case of
gene therapy. For example, tumoral masses are very often resistant to transfection, and the protein
of the invention would likely provide an effective way to facilitate the introduction of cytotoxic
genes (such as pro-apoptotic genes) or antitumoral drugs in solid tumors.

Still another embodiment of the protein of the invention relates to methods and
compositions to diagnose, treat, and prevent disorders associated with excess glutamate signaling in
the brain. As described above, the MIR16 protein interacts physically with the RGS16 protein
(Regulator of G protein Signaling 16). Receptors of many hormones use heterotrimeric G proteins
for signal transduction after ligand binding (for a review, see Neer, Cell 80:249-257 (1995)).
Among these receptors are metabotropic glutamate receptors (mGluRs). These receptors, which are
expressed in the brain, like the protein of the invention, are a novel family of cloned
G-protein-coupled receptors (Schoepp and Conn, Trends Pharmacol. Sci. 14:13-20 (1993)). Endogenous
glutamate, by activating the mGluR1 receptor (and also NMDA and AMPA receptors), may
contribute to the brain damage occurring acutely after epilepsy, cerebral ischemia or traumatic brain
injury. It may also contribute to chronic neurodegeneration in such disorders as amyotrophic lateral
sclerosis and Huntington's chorea (Meldrum, J. Nutr. 130(4S Suppl):1007S-1015S (2000)).

The invention thus relates to methods and compositions using cDNAs of SEQ ID NO:2 or
12 or part thereof, and proteins of SEQ ID NO:243 or 253 or part thereof, to diagnose, treat, or
prevent disorders associated with excess glutamate signaling in the brain. Specifically, the level of
activity or expression of the proteins can be correlated with the level of glutamate signaling, or with
the glutamate-signaling associated brain damage involved in epilepsy, cerebral ischemia, traumatic
brain damage, ALS, or Huntington's chorea, or with any other G-protein associated physiological
process or disease or condition. For situations where the level of the expression or activity of the
protein is positively correlated with such signaling or with the presence of a disease or condition,
the signaling, disease or condition can be detected using any of a number of tools for detecting
protein expression or activity, including northern blots, far-western blots and in situ hybridization
experiments, where an elevated level of the protein, protein activity, or nucleic acid of the invention
indicates the presence of the disease, condition, or signaling process. Further, such diseases or
conditions can be treated or prevented, or such signaling pathways can be inhibited, using
compounds that inhibit the expression or activity of the protein, such as antibodies, antisense
molecules, ribozymes, dominant negative forms of the protein, or any heterologous molecule that
inhibits protein activity or expression. Alternatively, where the expression or activity of the protein
of the invention is negatively associated with the signaling pathway, disease or condition, a
detection of a decreased level of expression or activity of the protein can be used to indicate the
presence of the disease, condition, or pathway. Further, in such cases, the disease or condition can
be treated or prevented, or the pathway be inhibited, using any compound that increases the activity
or level of the protein, such as nucleic acids encoding the protein, the protein itself, or heterologous
compounds that cause an increase in the level of protein expression or activity.

Protein of SEQ ID NO:386 (internal designation 105-037-4-0-H12-CS)

The protein of SEQ ID NO:386, encoded by the cDNA of SEQ ID NO:145, is strongly
expressed in the fetal brain and uterus. The 207-amino-acid-long protein of SEQ ID NO:386
displays pfam SPRY domains from positions 85 to 205.

SPRY domains have been found in a number of proteins involved in multiple cellular and
developmental processes. For example, the Midline-1/FXY family of proteins has been shown to
associate with microtubules, and has been implicated in human diseases, such as Opitz Syndrome, a
congenital disorder characterized by multiple developmental abnormalities (see, e.g., Cainarca, et
al., (1999) Hum Mol Genet 8(8):1387-96). In addition, the cytoplasmic Marenostrin/Pyrin protein
has been demonstrated to be the cause of Familial Mediterranean fever, an autosomal recessive
disorder characterized by fever and serositis (Nat Genet 1997 Sep;17(1):25-31). Other SPRY
proteins include SplA, a serine protease from Staphylococcus aureus, and butyrophilin, a major
milk protein. Another family of proteins known to contain the SPRY domain are the ryanodine
receptors (RyRs).

Ryanodine receptors play an important role in Ca2+ signaling in muscle and non-muscle
cells by releasing Ca2+ from intracellular stores. For example, these receptors are centrally
important in excitation-contraction (e-c) coupling, which occurs at specialized regions where the
sarcoplasmic reticulum (SR), containing the ryanodine receptors, and the plasma
membrane/transverse-tubule system form junctions. RyRs are also thought to play some role in
maintaining the structural integrity of the SR/T-tubule junctions. RyR is apparently unable to carry
out the requisite functions associated with e-c coupling by itself, however, because it forms
interactions with other macromolecules at the triad junction. For example, two small proteins,
calmodulin and FKBP12, are believed to modulate RyR at the triad junction.

It is believed that mammalian tissues express three different RyR isoforms, comprising four
560-kDa (RyR polypeptide) and four 12-kDa (FK506 binding protein) subunits. It is believed that
these large protein complexes conduct monovalent and divalent cations and are capable of multiple
interactions with other molecules. The subunits of the protein complexes include small diffusible
endogenous effector molecules including Ca2+, Mg2+, adenine nucleotides, sulfhydryl modifying
reagents (glutathione, NO, and NO adducts) and lipid intermediates, and proteins such as protein
kinases and phosphatases, calmodulin, immunophilins (FK506 binding proteins), and in skeletal
muscle the dihydropyridine receptor. The RyR from skeletal muscle is the major calcium release
channel for that tissue, and the most intensively studied of the three genetic isoforms detected thus
far in mammalian species. The other two RyR isoforms are often referred to as the 'heart' and 'brain'
forms, but the actual cell and tissue distribution of the isoforms is complex.

Because of their multiple ligand interactions, ryanodine receptors constitute an important,
potentially rich pharmacological target for controlling cellular functions. Ca2+ release channel
activity is modulated by many endogenous effectors, including Ca2+, ATP, Mg2+, and calmodulin.
In addition, many exogenous effectors, including caffeine, local anesthetics, and polyamines, also
modify channel activity. For example, tetracaine, procaine, benzocaine, and lidocaine inhibit Ca2+
release from the SR. They appear to interact with a specific site(s) located on the RyR, affecting
both ryanodine-binding and single channel activities (Shoshan-Barmatz et al. 1993; J. Membr. Biol.;
133; 171-181).

The importance of intracellular calcium as a second messenger in cellular signal
transduction processes is well established. Alterations in intracellular Ca2+ homeostasis have
profound effects on many cell functions, including secretion, contraction-relaxation, motility,
metabolism, protein synthesis, modification and folding, gene expression, cell-cycle progression
and apoptosis. A major source of cytoplasmic calcium is from intracellular storehouses located in
the endoplasmic reticulum, or in muscle, within the sarcoplasmic reticulum (SR).

Given that cellular Ca2+ handling is an important factor in the control of neuronal
metabolism and electrical activity, abnormalities of intracellular Ca2+ channels might be expected
to contribute to some forms of epilepsy or to anoxic brain damage following an episode of cerebral
ischemia. Cell loss is said to be a characteristic feature of degenerative brain disorders, including
Alzheimer's disease. It is well established that neuronal cell death may be secondary to an
abnormal elevation of cytoplasmic Ca2+, particularly that associated with activation of excitatory
glutamate receptors (e.g., in epilepsy). This strongly suggests that the release of stored Ca2+
contributes to nerve cell damage and cell death in various circumstances.

It is believed that the protein of SEQ ID NO:386 is functionally related to other
SPRY-containing proteins, such as the ryanodine receptors, Marenostrin/Pyrin, SplA, Midline-1/FXY,
and butyrophilin. Accordingly, it is believed that the present protein is associated with the release
of Ca2+ from intracellular Ca2+-storing organelles, like the endoplasmic reticulum and, in muscle,
the sarcoplasmic reticulum (SR), as well as being involved in microtubule binding. Preferred
polypeptides of the invention are any fragments of SEQ ID NO:386 having any of the biological
activities described herein.

In one embodiment, the present protein and nucleic acids can be used to specifically detect
cells of the fetal brain and uterus, as the protein is overexpressed in these tissues. For example, the
protein of the invention or part thereof may be used to synthesize specific antibodies using any
technique known to those skilled in the art. Such tissue-specific antibodies may then be used to
identify tissues of unknown origin, such as in forensic samples, differentiated tumor tissue that has
metastasized to foreign bodily sites, etc., or to differentiate different tissue types in a tissue
cross-section using immunochemistry. The protein can also be used to specifically label
microtubules in cells.

In another embodiment, the protein of the invention or part thereof may be used in 
regulating intracellular Ca2+ levels. As alterations in intracellular Ca2+ homeostasis have profound 
effects on many cell functions, including secretion, contraction-relaxation, motility, metabolism, 
protein synthesis, modification and folding, gene expression, cell-cycle progression and apoptosis, 
the ability to modulate intracellular Ca2+ levels provides a tool to alter any of these cellular 
functions, in vitro or in vivo. Such an ability has wide utility for a large number of applications, for 
example to manipulate the behavior (e.g. growth rate, secretion, survival, etc.) of cells grown in 
vitro, as well as to treat, prevent, or diagnose any of a number of diseases associated with altered 
Ca2+ signaling in vivo. The activity or expression of the protein of the invention can be modulated 
in any of a large number of ways, for example by administering to cells or to a patient the protein 
itself, a polynucleotide encoding the protein, antibodies, antisense sequences, dominant negative 
forms of the protein, compounds that alter the expression or activity of the protein, etc. The effect 
of any such agent on calcium flux in cells can be detected using standard methods, including by 
studying the permeation of Ca2+ release through endoplasmic reticulum (ER) and sarcoplasmic 
reticulum (SR) channels using tracers, light scattering and fluorescence quenching, and channel 
reconstitution in planar bilayers. In addition, targeted recombinant photoproteins can provide direct 
measurements of organellar Ca2+ (Montero et al., 1995, EMBO J. 14:5467-5475). 

The invention further relates to methods and compositions using the protein of the invention 
or part thereof to diagnose, prevent and/or treat several disorders in which the activity or 
recognition of ryanodine receptors is impaired or excessive. These disorders include, but are not 
limited to, neurodegenerative diseases, cardiovascular disorders, severe myasthenia, malignant 
hyperthermia, epilepsy, and central core disease. For example, in patients with severe myasthenia, 
the level of anti-RyR antibodies has been directly related to the severity of the disease (Skeie et al., 
1996, Eur. J. Neurol. 3:136-140). There is also some evidence to suggest that RyR abnormalities 
are a primary cause of many types of cardiac disease. In addition, the protein of the invention can 
be used to diagnose other diseases associated with SPRY-protein dysfunction, such as Familial 
Mediterranean fever and Opitz syndrome. Finally, as SPRY-containing proteins have been 
implicated in embryonic development (e.g. the Midline 1 protein), the protein and nucleic acids of 
the invention can be used to detect developmental disorders, as the detection of a mutation in the 
gene encoding SEQ ID NO:386, or a detection of abnormal gene expression in a fetus, can be used 
to indicate the presence of a developmental abnormality. For example, as the protein of SEQ ID 
NO:386 is strongly expressed in the fetal brain, it is likely that the protein plays a role in the normal 
development of the brain in utero. 

The present invention also relates to diagnostic assays for detecting altered levels of the 
protein of SEQ ID NO:386 in various tissues, as over-expression of the protein compared to normal 
control tissue samples can indicate the presence of certain disease conditions such as 
neurodegenerative disorders, cardiovascular disorders, severe myasthenia, malignant hyperthermia, 
epilepsy, and central core disease. Assays used to detect levels of the polypeptide of the present 
invention in a sample derived from a host are well-known to those of skill in the art and include 
radioimmunoassays, competitive-binding assays, Western blot analysis and ELISA assays. 

Proteins of SEQ ID NOs:283 and 286 (internal designations 174-38-1-0B6-CSJLA and 
174-41-1-0-A6-CS LA) 

The protein of SEQ ID NO:283, encoded by the cDNA of SEQ ID NO:42, is overexpressed 
in salivary glands and to a lesser extent in bone marrow, and shows homology over the C-terminal 
length to the immunoglobulin (Ig) protein superfamily, which is conserved among eukaryotes 
(including rabbit, rodents and human). In particular, the 468-amino-acid-long protein of the 
invention, which is similar in size to the constant chain of Ig related proteins, displays two pfam 
conserved immunoglobulin domains, from position 205 to 285 and from position 318 to 384, which 
are known to be involved in the basic structure of the light and heavy constant chains of 
immunoglobulins. It is known (Orr H.T., Nature 282:266-270(1979)) that the Ig constant chain 
domains and a single extracellular domain in each type of MHC chain are closely related, sharing 
over one hundred amino-acids of homology. All members of the Ig related superfamily, including 
the MHC class I alpha chain and beta-2-microglobulin, as well as the MHC class II alpha and beta 
chains, display the prosite conserved characteristic pattern around the C-terminal cysteine ([FY]-x- 
C-x-[VA]-x-H). This cysteine is involved in the disulfide bond between the light and heavy chains, 
and is also found in the protein of the invention (position 380 to 386). The protein of the invention 
also exhibits an emotif Ig and Major Histocompatibility Complex protein signature from positions 
319 to 336. In addition, the protein of the invention displays homology with tapasin (GenBank 
No. AF009510), a chaperone-like protein closely associated with TAP-binding proteins, which is 
well conserved among eukaryotes (chicken, rodents and human). Tapasin has been shown to 
increase the efficiency of antigen processing and presentation by mediating the association of MHC 
complex proteins with TAP proteins to the endoplasmic reticulum and to the cell surface during 
immune response (for review see Abele, R. and Tampe, R., Bioch. et Biophysica Acta, 1999). In 
addition, the protein of the invention displays two transmembrane domains from positions 199 to 
219 and from positions 406 to 426, a hydrophobic profile similar in amino acid position to the 
hydrophobic stretch of amino acids of human and mouse tapasin (Suling L., J. Biol. Chem., 
274:8649-8654, 1999), and a secreted signal peptide from position 9 to 23. Both signatures are 
largely present in Ig related proteins such as secreted antibodies or antigen presenting proteins. The 
invention also encompasses a variant (SEQ ID NO:286) of SEQ ID NO:283, encoded by the cDNA 
of SEQ ID NO:45. The protein of SEQ ID NO:286 is a 442-amino-acid-long protein whose 
C-terminal end is 26 amino acids shorter than that of the protein of SEQ ID NO:283. The variant of 
SEQ ID NO:286, which results from a frameshift (position 1445 in SEQ ID NO:45) in the coding 
sequence that leads to a stop codon in the corresponding protein, displays characteristics identical to 
those described above in terms of motifs, Ig signatures, function, and potential uses. 
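
By way of illustration only, the prosite-style pattern [FY]-x-C-x-[VA]-x-H quoted above can be translated into a regular expression and scanned against a protein sequence. The following Python sketch makes several assumptions: the function name and the toy sequence are hypothetical and are not taken from SEQ ID NO:283, and positions are reported 1-based and inclusive, matching the convention used in this description (a seven-residue match such as 380 to 386).

```python
import re

# [FY]-x-C-x-[VA]-x-H in regex form: "x" (any residue) becomes ".",
# bracketed alternatives stay as character classes. The lookahead wrapper
# lets overlapping occurrences be reported as well.
IG_MHC_PATTERN = re.compile(r"(?=([FY].C.[VA].H))")


def find_ig_mhc_signature(sequence):
    """Return (start, end, match) tuples with 1-based inclusive positions."""
    hits = []
    for m in IG_MHC_PATTERN.finditer(sequence):
        start = m.start() + 1          # convert 0-based index to 1-based position
        matched = m.group(1)
        hits.append((start, start + len(matched) - 1, matched))
    return hits


if __name__ == "__main__":
    toy_sequence = "MKTYTCEVTHGGS"   # hypothetical sequence, for demonstration only
    print(find_ig_mhc_signature(toy_sequence))
```

A hit reported by such a scan only indicates that the residues fit the pattern; it does not by itself establish that the cysteine participates in a disulfide bond.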

The immunoglobulin (Ig) gene superfamily comprises a large number of cell surface 
glycoproteins that share sequence homology with the V and C domains of antibody heavy and light 
chains. These molecules function as receptors for antigens, immunoglobulins and cytokines as well 
as adhesion molecules, and play important roles in regulating the complex cell interactions that 
occur within the immune system (A. F. Williams et al., Annu. Rev. Immunol. 6:381-405, 1988; T. 
Hunkapiller et al., Adv. Immunol. 44:1-63, 1989; for a short review see also Prosite entry PS00290). 

The introduction of an antigen into a host initiates a series of events culminating in an 
immune response. In addition, self-antigens can result in immunological tolerance or activation of 
an immune response against self-antigens. A major portion of the immune response is regulated by 
presentation of antigen by major histocompatibility complex molecules. MHC molecules bind to 
peptide fragments derived from antigens to form complexes that are recognized by T cell receptors 
on the surface of T cells, giving rise to the phenomenon of MHC-restricted T cell recognition. The 
ability of a host to react to a given antigen (responsiveness) is influenced by the spectrum of MHC 
molecules expressed by the host. Responsiveness correlates with the ability of specific peptide 
fragments to bind to particular MHC molecules. 

There are two types of MHC molecules, class I and class II, each of which comprises two 
chains. In class I [2], the alpha chain is composed of three extracellular domains, a transmembrane 
region, and a cytoplasmic tail. The beta chain (beta-2-microglobulin) is composed of a single 
extracellular domain. In class II [3], both the alpha and the beta chains are composed of two 
extracellular domains, a transmembrane region and a cytoplasmic tail. MHC class I molecules are 
expressed on the surface of all cells, and MHC class II molecules are expressed on the surface of 
antigen presenting cells. MHC class II molecules bind to peptides derived from proteins made 
outside of an antigen presenting cell. In contrast, MHC class I molecules bind to peptides derived 
from proteins made inside a cell. In order to present peptide in the context of a class II molecule, an 
antigen presenting cell phagocytoses an antigen into an intracellular vesicle, in which the antigen is 
cleaved, bound to an MHC class II molecule, and then returned to the surface of the antigen 
presenting cell. 

Major histocompatibility complex (MHC) class I molecules present antigenic peptides to 
CD8 T cells (Townsend, A. et al., Nature 340:443-448). The peptides are generated in the cytosol 
and then translocated across the membrane of the endoplasmic reticulum by the transporter 
associated with antigen processing (TAP). TAP is a trimeric complex consisting of TAP1, TAP2, 
and tapasin (TAP-A). TAP1 and TAP2 are required for the peptide transport. Tapasin mediates the 
interaction of MHC class I HC-beta-2-microglobulin with TAP, and this interaction is essential for 
peptide loading onto MHC class I HC-beta-2-microglobulin (Suling et al., J. Biol. Chem., 
274:8649-8654). T cell receptors (TCRs) are the second antigen recognition molecules, and 
recognize antigens that are bound by MHC molecules. Recognition of MHC complexed with 
peptide (MHC-peptide complex) by TCR can affect the activity of the T cell bearing the TCR. 
Thus, MHC-peptide complexes are important in the regulation of T cell activity and, consequently, in 
regulating an immune response. 

Human cytomegalovirus (HCMV) is a betaherpesvirus which causes clinically serious 
disease in immunocompromised and immunosuppressed adults, as well as in some infants infected 
in utero or perinatally (Alford, C. A., and W. J. Britt. 1990. Cytomegalovirus, p. 1981-2010. In D. 
M. Knipe and B. N. Fields (ed.), Virology, 2nd ed. Raven Press, New York). In 
HCMV-infected cells, expression of the cellular major histocompatibility 
complex (MHC) class I heavy chains is down-regulated, where down-regulation is defined as 
reduction in either synthesis, stability or surface expression of MHC class I heavy chains. A similar 
phenomenon has been reported for some other DNA viruses, including adenovirus, murine 
cytomegalovirus, and herpes simplex virus (Anderson, M., et al., Cell 43:215-222, 1985; Burgert 
and Kvist, Cell 41:987-997, 1985; Heise T. M., et al., J. Exp. Med. 187:1037-1046, 1998). In the 
adenovirus and herpes simplex virus systems, the product of a viral gene which is dispensable for 
replication in vitro is sufficient to cause down-regulation of MHC class I heavy chains (Anderson, 
M., et al., 1985, supra). The gene(s) involved in class I heavy chain down-regulation by murine 
cytomegalovirus have not yet been identified. 

It is believed that the proteins of SEQ ID NOs:283 and 286 are members of the 
immunoglobulin superfamily and, as such, play a role in the immune response, cellular proteolysis, 
cell proliferation and differentiation, pathogen recognition, apoptosis, and other processes 
associated with the Ig superfamily. In addition, the proteins of the invention are thought to be 
tightly linked to the antigen processing and presentation system in the context of peptide assembly 
and translocation of foreign peptides across endoplasmic reticulum and cell surface membranes as 
new chaperonin-like proteins associated with MHC I and TAP proteins. The weak homology (30%) 
with the TAP protein family is thought to indicate the specificity of the interactions of the proteins 
of the invention with MHC proteins and/or TAP-related proteins, as described by Suling et al., 
supra. 

Preferred polypeptides of the invention are polypeptides comprising the amino acids of 
SEQ ID NO:283 from position 9 to 23, 199 to 219, 205 to 285, 318 to 384, 319 to 336, 380 to 386 
and from 406 to 426. Other preferred polypeptides of the invention are fragments of SEQ ID 
NO:283 having any of the biological activities described herein. 
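
The preferred fragments above are given as 1-based, inclusive residue ranges. As a minimal sketch of how such ranges map onto a sequence held as a string, the following Python fragment may be helpful. The helper names are hypothetical, the feature labels merely restate the annotations given earlier in this description, and no actual sequence data are reproduced.

```python
# Residue ranges of SEQ ID NO:283 listed above (1-based, inclusive).
PREFERRED_RANGES = {
    "signal peptide": (9, 23),
    "transmembrane domain 1": (199, 219),
    "Ig domain 1": (205, 285),
    "Ig domain 2": (318, 384),
    "Ig/MHC emotif signature": (319, 336),
    "Ig/MHC prosite pattern": (380, 386),
    "transmembrane domain 2": (406, 426),
}


def extract_fragment(sequence, start, end):
    """Return residues start..end (1-based, inclusive) of a sequence string."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError(f"range {start}-{end} is outside a sequence of length {len(sequence)}")
    return sequence[start - 1:end]     # Python slices are 0-based and end-exclusive


def fragment_lengths(ranges):
    """Length in residues of each named range."""
    return {name: end - start + 1 for name, (start, end) in ranges.items()}


if __name__ == "__main__":
    for name, length in fragment_lengths(PREFERRED_RANGES).items():
        print(f"{name}: {length} residues")
```

Because all ranges end at or before position 426, they fall within both the 468-residue protein of SEQ ID NO:283 and the 442-residue variant of SEQ ID NO:286.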

In one embodiment, the invention relates to methods and compositions for using the protein 
of the invention or part thereof as a marker protein to selectively identify tissues, such as salivary 
glands and bone marrow tissues, which strongly express the protein of the invention. For example, 
the protein of the invention or part thereof may be used to synthesize specific antibodies using any 
technique known to those skilled in the art, including those described herein. Such tissue-specific 
antibodies may then be used to identify tissues of unknown origin, for example, forensic samples, 
differentiated tumor tissue that has metastasized to foreign bodily sites, or to differentiate different 
tissue types in a tissue cross-section using immunochemistry. 

In another embodiment, the invention relates to methods for using the protein of the 
invention to visualize proteins and peptides involved in the antigen recognition system within cells by 
virtue of their physical interaction with the proteins of the invention. For example, the protein may 
be used to detect the presence and/or the localization of MHC peptides and TAP-like proteins in a 
cell. The protein of the invention, and hence any interacting proteins, can be labeled using any of a 
number of methods, including by binding with specific antibodies or by creating a fusion protein 
comprising the protein of the invention as well as a readily detectable moiety, such as an epitope 
tag, biotin, or green fluorescent protein. 

In another embodiment, polynucleotide or polypeptide sequences of the invention or part 
thereof may be used for the diagnosis of a disorder associated with a loss of regulation of the 
expression of the protein of the invention, preferably, but not limited to, deficiencies of the MHC 
protein system. Examples of such disorders include, but are not limited to, acquired 
immunodeficiency syndrome (AIDS), X-linked agammaglobulinemia of Bruton, common variable 
immunodeficiency (CVI), DiGeorge's syndrome (thymic hypoplasia), thymic dysplasia, isolated 
IgA deficiency, severe combined immunodeficiency disease (SCID), immunodeficiency with 
thrombocytopenia and eczema (Wiskott-Aldrich syndrome), Chediak-Higashi syndrome, chronic 
granulomatous diseases, hereditary angioneurotic edema, immunodeficiency associated with 
Cushing's disease, Addison's disease, adult respiratory distress syndrome, allergies, ankylosing 
spondylitis, amyloidosis, anemia, asthma, atherosclerosis, autoimmune hemolytic anemia, 
autoimmune thyroiditis, bronchitis, cholecystitis, contact dermatitis, Crohn's disease, atopic 
dermatitis, dermatomyositis, diabetes mellitus, emphysema, episodic lymphopenia with 
lymphocytotoxins, erythroblastosis fetalis, erythema nodosum, atrophic gastritis, 
glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's thyroiditis, 
hypereosinophilia, irritable bowel syndrome, multiple sclerosis, myasthenia gravis, myocardial or 
pericardial inflammation, osteoarthritis, osteoporosis, pancreatitis, polymyositis, psoriasis, Reiter's 
syndrome, rheumatoid arthritis, scleroderma, Sjogren's syndrome, systemic anaphylaxis, systemic 
lupus erythematosus, systemic sclerosis, thrombocytopenic purpura, ulcerative colitis, uveitis, 
Werner syndrome, complications of cancer, hemodialysis, and extracorporeal circulation, leukemias 
such as multiple myeloma, and lymphomas such as Hodgkin's disease; a cell proliferative disorder 
such as arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed connective tissue disease 
(MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria, polycythemia vera, psoriasis, 
primary thrombocythemia, and cancers including adenocarcinoma, leukemia, lymphoma, 
melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, cancers of the adrenal gland, 
bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia, gastrointestinal tract, heart, 
kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary glands, skin, 
spleen, testis, thymus, thyroid, and uterus; and an infection, such as infections by viral agents 
classified as adenovirus, arenavirus, bunyavirus, calicivirus, coronavirus, filovirus, hepadnavirus, 
herpesvirus, flavivirus, orthomyxovirus, parvovirus, papovavirus, paramyxovirus, picornavirus, 
poxvirus, reovirus, retrovirus, rhabdovirus, and togavirus; infections by bacterial agents classified as 
pneumococcus, staphylococcus, streptococcus, bacillus, corynebacterium, clostridium, 
meningococcus, gonococcus, listeria, moraxella, kingella, haemophilus, legionella, bordetella, 
gram-negative enterobacterium including shigella, salmonella, and campylobacter, pseudomonas, 
vibrio, brucella, francisella, yersinia, bartonella, nocardia, actinomyces, mycobacterium, 
spirochaetale, rickettsia, chlamydia, and mycoplasma; infections by fungal agents classified as 
aspergillus, blastomyces, dermatophytes, cryptococcus, coccidioides, malassezia, histoplasma, and 
other fungal agents causing various mycoses; and infections by parasites classified as plasmodium 
or malaria-causing, parasitic entamoeba, leishmania, trypanosoma, toxoplasma, Pneumocystis 
carinii, intestinal protozoa such as giardia, trichomonas, tissue nematodes such as trichinella, 
intestinal nematodes such as ascaris, lymphatic filarial nematodes, trematodes such as schistosoma, 
and cestodes such as tapeworm. To assess abnormal expression of the present protein associated 
with any of these disorders, the level of the present polynucleotides or polypeptides can be detected 
in a biological sample or cell using any standard method, including Southern or northern analysis, 
dot blots, other membrane-based technologies, PCR technologies, dipstick, pin, ELISA assays, and 
microarrays. Any of these methods may be used for the diagnosis of disorders characterized by 
an alteration of expression of SEQ ID NO:283 or 286, such as the disorders mentioned above, or in 
assays to monitor patients being treated with SEQ ID NO:283 or 286 or agonists, antagonists, or 
inhibitors of SEQ ID NO:283 or 286. Antibodies useful for diagnostic purposes may be prepared, 
e.g., in the same manner as that described in U.S. Patent No. 6,135,941. Diagnostic assays for SEQ 
ID NO:283 or 286 include methods which utilize the antibody and a label to detect SEQ ID 
NO:283 or 286 in human body fluids or in extracts of cells or tissues. The antibodies may be used with 
or without modification, and may be labeled by covalent or non-covalent attachment of a reporter 
molecule. A wide variety of reporter molecules, several of which are described above, are known in 
the art and may be used. 

In another embodiment, the protein of SEQ ID NO:283 or 286 or a fragment or derivative 
thereof may be administered to a subject to diagnose, treat or prevent an immune disorder 
associated with decreased expression or activity of the protein of the invention. Such disorders can 
include, but are not limited to, acquired immunodeficiency syndrome (AIDS), X-linked 
agammaglobulinemia of Bruton, common variable immunodeficiency (CVI), DiGeorge's syndrome 
(thymic hypoplasia), thymic dysplasia, isolated IgA deficiency, severe combined immunodeficiency 
disease (SCID), immunodeficiency with thrombocytopenia and eczema (Wiskott-Aldrich 
syndrome), Chediak-Higashi syndrome, chronic granulomatous diseases, hereditary angioneurotic 
edema, immunodeficiency associated with Cushing's disease, Addison's disease, adult respiratory 
distress syndrome, allergies, ankylosing spondylitis, amyloidosis, anemia, asthma, atherosclerosis, 
autoimmune hemolytic anemia, autoimmune thyroiditis, bronchitis, cholecystitis, contact dermatitis, 
Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus, emphysema, episodic 
lymphopenia with lymphocytotoxins, erythroblastosis fetalis, erythema nodosum, atrophic gastritis, 
glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's thyroiditis, 
hypereosinophilia, irritable bowel syndrome, multiple sclerosis, myasthenia gravis, myocardial or 
pericardial inflammation, osteoarthritis, osteoporosis, pancreatitis, polymyositis, psoriasis, Reiter's 
syndrome, rheumatoid arthritis, scleroderma, Sjogren's syndrome, systemic anaphylaxis, systemic 
lupus erythematosus, systemic sclerosis, thrombocytopenic purpura, ulcerative colitis, uveitis, 
Werner syndrome, complications of cancer, hemodialysis, and extracorporeal circulation, leukemias 
such as multiple myeloma, and lymphomas such as Hodgkin's disease. In addition, such disorders 
associated with decreased protein expression or activity can be treated by administering to a patient 
polynucleotide sequences encoding the protein of the invention, e.g. inserted in an appropriate 
vector. In another example, a compound that increases either the activity of the protein of the 
invention or its expression can be administered to a patient to treat or prevent any of the diseases 
mentioned above. 

In a further embodiment, an antagonist of the protein of the invention may be administered 
to a subject to treat or prevent an immune disorder associated with increased expression or activity 
of the protein of SEQ ID NO:283 or 286 including, but not limited to, auto-immune diseases or 
graft rejection. In one aspect, an antibody which specifically binds the protein of the invention may 
be used directly as an antagonist or indirectly as a targeting or delivery mechanism for bringing a 
pharmaceutical agent to cells or tissues which express the proteins of the invention, such as the 
salivary gland tissue or the bone marrow tissue. In addition, sense, antisense nucleotides, GSE, 
ribozymes, specific protein inhibitors such as antibodies or small compounds can be administered 
to inhibit the expression of the proteins of the invention. 

In another embodiment, an antagonist of the protein of SEQ ID NO:283 may be 
administered to a subject to treat or prevent a cell proliferative disorder. Such disorders may 
include, but are not limited to, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed 
connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria, 
polycythemia vera, psoriasis, primary thrombocythemia, and cancers including adenocarcinoma, 
leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, cancers of 
the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia, 
gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, 
prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and uterus. In one aspect, an antibody 
which specifically binds the protein of the invention may be used directly as an antagonist or 
indirectly as a targeting or delivery mechanism for bringing a pharmaceutical agent to cells or tissue 
which express the protein of the invention. In another example, sense, antisense nucleotides, GSE, 
or ribozymes designed from nucleotides of the invention can be administered to inhibit the 
expression of the protein of the invention. 

Protein of SEQ ID NO:411 (internal designation 181-10-1-0-C9-CS) 

The protein of SEQ ID NO:411, encoded by the cDNA of SEQ ID NO:170, is highly 
expressed in fetal liver. The protein of the invention is homologous to peripheral benzodiazepine 
receptor/isoquinoline binding protein (PBR/IBP) of human, bovine and murine origin (Genbank 
accession numbers M36035, M64520 and L17306 respectively). The 170-amino-acid protein of 
SEQ ID NO:411 is similar in size and hydropathicity to known peripheral 
benzodiazepine receptors/isoquinoline binding proteins. Like the known peripheral benzodiazepine 
receptors/isoquinoline binding proteins, the protein of the subject invention has five potential 
transmembrane domains at positions 3-23, 45-65, 82-102, 105-125 and 130-150. Moreover, the 
protein of the invention displays a stretch of 11 amino acids (starting with V144 and ending with 
R154) that corresponds to a recently identified putative cholesterol recognition/interaction amino 
acid consensus pattern (-L/V-(X)(1-5)-Y-(X)(1-5)-R/K-) [see Li et al., Endocrinology 1998 Dec; 
139(12):4991-7]. 
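
For illustration, the cholesterol recognition/interaction consensus pattern -L/V-(X)(1-5)-Y-(X)(1-5)-R/K- can be written as a regular expression and used to scan a protein sequence. The Python sketch below is not part of the disclosed methods; the function name and the toy sequence are hypothetical, and the sequence of SEQ ID NO:411 is not reproduced. The pattern spans between 5 and 13 residues, so an 11-residue stretch such as V144 to R154 is within its range.

```python
import re

# L/V, then 1-5 residues of any kind, then Y, then 1-5 residues of any kind,
# then R/K. The lookahead wrapper reports one match per possible start
# position, so motifs that overlap one another are not skipped.
CRAC_PATTERN = re.compile(r"(?=([LV].{1,5}Y.{1,5}[RK]))")


def find_crac_motifs(sequence):
    """Return (start, end, motif) tuples with 1-based inclusive positions."""
    hits = []
    for m in CRAC_PATTERN.finditer(sequence):
        motif = m.group(1)
        start = m.start() + 1          # convert 0-based index to 1-based position
        hits.append((start, start + len(motif) - 1, motif))
    return hits


if __name__ == "__main__":
    toy_sequence = "GGVAAYAARGG"     # hypothetical sequence, for demonstration only
    print(find_crac_motifs(toy_sequence))
```

Only one match is reported for each start position, even where several spacer lengths would satisfy the pattern; a match indicates conformity with the consensus and is not evidence of cholesterol binding.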

The peripheral benzodiazepine receptor (PBR) is an 18-kDa protein containing binding sites 
for benzodiazepine and is distinct from the GABA neurotransmitter receptor [Papadopoulos, V. 
(1993) Endocr. Rev. 14: 222-240]. Expression of PBR has been found in every tissue examined. 
However, it is most abundant in steroidogenic cells and is found primarily on outer 
mitochondrial membranes [Anholt, R. et al. (1986) J. Biol. Chem. 261:576-583]. PBR is thought to 
be associated with a multimeric complex composed of the 18-kDa isoquinoline binding protein and 
the 34-kDa pore-forming voltage dependent anion channel protein, preferentially located on the 
outer/inner mitochondrial membrane contact sites [McEnery, M.W. et al. Proc. Natl. Acad. Sci. 
USA. 89:3170-3174; Garnier, M. et al. (1994) Mol. Pharmacol. 45:201-211; Papadopoulos, V. et 
al. (1994) Mol. Cell. Endocr. 104:R5-R9]. Drug ligands of PBR, upon binding to the receptor, 
stimulate steroid synthesis in steroidogenic cells in vitro [Papadopoulos, V. et al. (1990) J. Biol. 
Chem. 265: 3772-3779; Barnea, E. R. et al. (1989) Mol. Cell. Endocr. 64: 155-159; Amsterdam, A. 
and Suh, B.S. (1991) Endocrinology 128: 503-510]. Likewise, in vivo studies showed that high 
affinity PBR ligands increase steroid plasma levels in hypophysectomized rats [Amri, H. et al. 
(1996) Endocrinology 137:5707-5718]. Further in vitro studies on isolated mitochondria provided 
evidence that PBR ligands, drug ligands, or the endogenous PBR ligand (the polypeptide 
diazepam-binding inhibitor (DBI) [Papadopoulos, V. et al. (1997) Steroids 62: 21-28]) stimulate pregnenolone 
formation by increasing the rate of cholesterol transfer from the outer to the inner mitochondrial 
membrane [for review, see Culty, M. et al. (1999) Journal of Steroid Biochemistry and Molecular 
Biology 69: 123-130]. 

Based on the amino acid sequence of the 18-kDa PBR, a three dimensional model was 
developed [Papadopoulos, V. (1996) In: The Leydig Cell. Payne, A. H. et al (eds) Cache River 
Press, IL, pp 596-628]. This model was shown to accommodate a cholesterol molecule and 
function as a channel, supporting the role of PBR in cholesterol transport. The role of PBR in 
steroidogenesis was also demonstrated by observing that PBR negative cells generated by 
homologous recombination failed to produce steroids [Papadopoulos, V. et al (1997) J Biol. Chem 
272: 32129-32135]. Further, cholesterol transport experiments in bacteria expressing the 18-kDa 
PBR protein provided definitive evidence for a function as a cholesterol channel/transporter 
[Papadopoulos, V. et al (1997) supra]. 

In addition to its role in mediating cholesterol movement across membranes, PBR has been 
implicated in several other physiological functions, including cell growth and differentiation, 
chemotaxis, mitochondrial physiology, porphyrin and heme biosynthesis, immune response, anion 
transport and GABAergic regulation of the CNS [for review, see Gavish, M. et al. (1999) 
Pharmacological Reviews 51: 629-650; Beurdeley-Thomas, A. et al. (2000) Journal of 
Neuro-Oncology 46: 45-56]. A recent report also indicates that PBR agonists are potent anti-apoptotic 
compounds, suggesting that this effect may represent a major function for this receptor 
[Bono, F. et al. (1999) Biochemical and Biophysical Research Communications 265:457-461]. 

It appears that PBR is associated with stress and anxiety disorders. It has been suggested 
that PBRs play a role in the regulation of several stress systems such as the HPA axis, the 
sympathetic nervous system, the renin-angiotensin axis, and the neuroendocrine axis. In these 
systems, acute stress typically leads to increases in PBR density, whereas chronic stress typically 
leads to decreases in PBR density. Furthermore, in Generalized Anxiety Disorder (GAD), Panic 
Disorder (PD), Generalized Social Phobia (GSP), and Post-Traumatic Stress Disorders (PTSD), 
PBR density is typically decreased in platelets. 

In the brain, where PBRs are associated with glial cells, PBRs are increased in specific 
brain areas in neurodegenerative disorders and also after neurotoxic and traumatic-ischemic brain 
damage [for review, see Gavish, M. et al. (1999) supra]. The literature also reports a decrease in 
peripheral-type benzodiazepine receptors in postmortems of chronic schizophrenics, suggesting that 
the decreased density of PBRs in the brain may be involved in the pathophysiology of 
schizophrenia. Increased levels of PBR in autopsied brain tissue from portal-systemic 
encephalopathy (PSE) patients have been reported, thus supporting the theory that activation of 
PBR contributes to the pathogenesis characteristic of PSE in the 
central nervous system [Kurumaji, A. et al. (1997) J. Neural Transm. 104:1361-1370; Butterworth 
R. F. (2000) Neurochemistry International 36: 411-416]. 

In addition to its involvement in the neurological disorders discussed supra, PBR has been 
implicated in the regulation of tumor cell proliferation [for review, see Gavish, M. et al. (1999) 
supra; Beurdeley-Thomas, A. et al. (2000) supra; Hardwick, M. (1999) Cancer Research 
59:831-842; Venturini, I. et al. (1998) Life Sci 63:1269-80; Carmel I. et al. (1999) Biochem Pharmacol 58: 
273-8]. The invasiveness and metastatic ability of human breast tumor cells is proportional to the 
level of PBR expressed. Further, PBR has been proposed as a tool/marker for detection, 
diagnosis, prognosis and treatment of cancer [WO 99/49316, hereby incorporated by reference in its 
entirety]. 

Many ligands have been described that bind to peripheral benzodiazepine receptor with 
various affinities. Some benzodiazepines, Ro 5-4864 [4-chlorodiazepam], diazepam and structurally 
related compounds, are potent and selective PBR ligands. Exogenous ligands also include 
2-phenylquinoline carboxamides (PK 11195 series), imidazo[1,2-a]pyridine-3-acetamides (Alpidem 
series) and pyridazine derivatives. Some endogenous compounds, including porphyrins and 
diazepam binding inhibitor (DBI), bind to PBR with nanomolar and micromolar affinity [for 
review, see Gavish, M. et al. (1999) supra; Beurdeley-Thomas, A. et al. (2000) supra]. 

The protein of SEQ ID NO: 411 is a novel peripheral-type benzodiazepine receptor. As
such, it serves a channel function that mediates cholesterol movement across membranes, and plays a
role in steroidogenesis, cell growth and differentiation, chemotaxis, mitochondrial physiology,
protection against apoptosis, porphyrin and heme biosynthesis, immune response, anion transport
and GABAergic regulation of the CNS.

In one embodiment, a preferred polypeptide of the invention comprises the amino acids of
SEQ ID NO: 411 from position 144 to 154. In another embodiment, the subject invention provides
a polypeptide comprising the sequence of SEQ ID NO: 411. Other preferred polypeptides of the
invention include biologically active fragments of SEQ ID NO: 411. Biologically active fragments
of the protein of SEQ ID NO: 411 have any of the biological activities described herein which are
associated with the PBR. In another embodiment, the polypeptide of the invention is encoded by
clone 181-10-1-0-C9-CS.

One aspect of the subject invention provides compositions and methods using the protein of
the invention, or biologically active fragments thereof, for the development, identification, and/or
selection of agents capable of modulating the expression or activity of the protein of the invention.

Agents which modulate the activity of the PBR/IBP of the subject invention include, but are
not limited to, antisense oligonucleotides, ribozymes, drugs, and antibodies. These agents may be
made and used according to methods well known in the art. Also, the protein of the invention, or
biologically active fragments thereof, may be used in screening assays for therapeutic compounds.
A variety of drug screening techniques may be employed. In this aspect of the invention, the
protein, or biologically active fragment thereof, may be free in solution, affixed to a solid support,
recombinantly expressed on, or chemically attached to, a cell surface, or located intracellularly.
The formation of binding complexes between the protein of the invention, or biologically active
fragments thereof, and the compound being tested may then be measured.

In one embodiment, the subject method utilizes eukaryotic or prokaryotic host cells which
are stably transformed with recombinant nucleic acids expressing the PBR/IBP polypeptide or
biologically active fragments thereof. The transformed cells may be viable or fixed. Drugs or
compounds which are candidates for the modulation of the PBR/IBP, or biologically active
fragments thereof, are screened against such transformed cells in binding assays well known to
those skilled in the art. Alternatively, assays such as those taught in Geysen H. N., WO Application
84/03564, published on Sep. 13, 1984, and incorporated herein by reference in its entirety, may be
used to screen for peptide compounds which demonstrate binding affinity for, or the ability to
modulate, the PBR/IBP, or biologically active fragments thereof. In another embodiment,
competitive drug screening assays are used in which neutralizing antibodies specifically compete
with a test compound for binding to the PBR/IBP protein of the invention, or biologically active
fragments thereof.

Another embodiment of the subject invention provides compositions and methods of
selectively modulating the expression or activity of the protein of the invention. Modulation of the
PBR/IBP would allow for the successful treatment and/or management of diseases or biochemical
abnormalities associated with the PBR or PBR/IBP. Antagonists, able to reduce or inhibit the
expression or the activity of the protein of the invention, would be useful in the treatment of
diseases associated with elevated levels of the PBR/IBP, increased cell proliferation, or increased
cholesterol transport. Thus, the subject invention provides methods for treating a variety of diseases
or disorders, including, but not limited to, cancers, especially liver cancer, and portal-systemic
encephalopathy.

Alternatively, the subject invention provides methods of treating diseases or disorders
associated with decreased levels of the PBR/IBP. Thus, the subject invention provides
methods of treating diseases including, but not limited to, schizophrenia, chronic stress, GAD, PD,
GSP and PTSD. Other diseases which may be treated by agonists of the PBR/IBP of the subject
invention include those diseases associated with decreases in cell proliferation, e.g., developmental
retardation.

Furthermore, because the PBR/IBP of the subject invention is also able to transport
cholesterol into cells, the subject invention may also be used to increase cholesterol transport into
cells. Diseases associated with cholesterol transport deficiencies include lipoidal adrenal
hyperplasia, and diseases where there is a requirement for increased production of
cholesterol-requiring compounds such as myelin, for example Alzheimer's disease, spinal cord
injury, and brain development neuropathy [Snipes, G. and Suter, U. (1997) Cholesterol and Myelin.
In: Subcellular Biochemistry, Robert Bittman (ed.), vol. 28, pp. 173-204, Plenum Press, New York].
The methods of treating disorders associated with decreased levels of PBR/IBP may be practiced by
introducing agonists which stimulate the expression or the activity of the protein of the invention.

In one embodiment, methods of increasing the levels of PBR/IBP in tissues or cell types
may be practiced by utilizing nucleic acids encoding the protein of the subject invention, or
biologically active fragments thereof, to introduce biologically active polypeptide into targeted cell
types. Vectors useful in such methods are known to those skilled in the art, as are methods of
introducing such nucleic acids into target tissues.

Agents which stimulate or inhibit the activity of the protein of the invention include but are 
not limited to agonist and antagonist drugs respectively. These drugs can be obtained using any of a 
variety of drug screening techniques as discussed above. 

Antagonists of the PBR/IBP encoded by SEQ ID NO: 170 include agents which decrease
the levels of expressed mRNA encoding the protein of SEQ ID NO: 411. These include, but are not
limited to, RNAi, one or more ribozymes capable of digesting the mRNA of the protein of the
invention, or antisense oligonucleotides capable of hybridizing to mRNA encoding the PBR/IBP of
SEQ ID NO: 411. Antisense oligonucleotides can be administered as DNA, as DNA entrapped in
proteoliposomes containing viral envelope receptor proteins [Kanoda, Y. et al. (1989) Science 243:
375] or as part of a vector which can be expressed in the target cell and provide antisense DNA or
RNA. Vectors which are expressed in particular cell types are known in the art. Alternatively, the
DNA can be injected along with a carrier. A carrier can be a protein such as a cytokine, for example
interleukin 2, or polylysine-glycoprotein carriers. Carrier proteins, vectors, and methods of making
and using polylysine carrier systems are known in the art. Alternatively, nucleic acid encoding
antisense molecules may be coated onto gold beads and introduced into the skin with, for example,
a gene gun [Ulmer, J.B. et al. (1993) Science 259: 1745].
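
As a minimal illustration of the antisense principle described above, the following Python sketch
derives an antisense DNA oligonucleotide as the reverse complement of a window of a target mRNA.
The target sequence shown is a made-up placeholder, not a fragment of SEQ ID NO: 170, and the
function name and window length are assumptions chosen purely for illustration.

```python
# Hypothetical helper: builds an antisense DNA oligo against a window of an mRNA.
# Maps each sense base (RNA or DNA alphabet) to its complementary DNA base.
COMPLEMENT = {"A": "T", "U": "A", "T": "A", "G": "C", "C": "G"}


def antisense_dna(mrna: str, start: int, length: int) -> str:
    """Return the antisense DNA oligo (written 5'->3') for mrna[start:start+length]."""
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    window = mrna.upper()[start:start + length]
    if len(window) != length:
        raise ValueError("window extends past the end of the sequence")
    # Reverse the window, then complement each base, so the result reads 5'->3'.
    return "".join(COMPLEMENT[base] for base in reversed(window))


# Placeholder sequence for illustration only (not taken from the sequence listing).
placeholder_mrna = "AUGGCUCCGUGGAAUCUG"
oligo = antisense_dna(placeholder_mrna, 0, 12)
print(oligo)
```

In practice, the choice of target window would also depend on factors the sketch ignores, such as
target accessibility and oligonucleotide specificity.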

Antibodies, or other polypeptides, capable of reducing or inhibiting the activity of PBR/IBP
may be provided in isolated and substantially purified form. Alternatively, antibodies or other
polypeptides capable of inhibiting or reducing the activity of the PBR/IBP protein may be
recombinantly expressed in the target cell to provide a modulating effect. In addition, compounds
which inhibit or reduce the activity of the PBR/IBP protein of the subject invention may be
incorporated into biodegradable polymers that are implanted in the vicinity of where drug delivery
is desired. For example, biodegradable polymers may be implanted at the site of a tumor or,
alternatively, biodegradable polymers containing antagonists/agonists may be implanted to slowly
release the compounds systemically. Biodegradable polymers, and their use, are known to those of
skill in the art (see, for example, Brem et al. (1991) J. Neurosurg. 74:441-446).

In another embodiment, the invention provides methods and compositions for detecting the
level of expression of the mRNA of the protein of the invention. Quantification of mRNA levels of
the PBR/IBP protein of the invention may be useful for the diagnosis or prognosis of diseases
associated with an altered expression of the protein of the invention. Assays for the detection and
quantification of the mRNA of the protein of the invention are well known in the art (see, for
example, Maniatis, Fritsch and Sambrook, Molecular Cloning: A Laboratory Manual (1982), or
Current Protocols in Molecular Biology, Ausubel, F.M. et al. (Eds), Wiley & Sons, Inc.).

Polynucleotide probes or primers for the detection of the mRNA of the protein of SEQ ID
NO: 411 can be designed from the cDNA of SEQ ID NO: 170. Methods for designing probes and
primers are known in the art. In another embodiment, the subject invention provides diagnostic kits
for the detection of the mRNA of the protein of the invention in cells. The kit comprises a package
having one or more containers of oligonucleotide primers for detection of the protein of the
invention in PCR assays or one or more containers of polynucleotide probes for the detection of the
mRNA of the protein of the invention by in situ hybridization or Northern analysis. Kits may,
optionally, include containers of various reagents used in various hybridization assays. The kit may
also, optionally, contain one or more of the following items: polymerization enzymes, buffers,
instructions, controls, or detection labels. Kits may also, optionally, include containers of reagents
mixed together in suitable proportions for performing the hybridization assay methods in
accordance with the invention. Reagent containers preferably contain reagents in unit quantities that
obviate measuring steps when performing the subject methods.

In another embodiment, the invention relates to methods and compositions for detecting and
quantifying the level of the protein of the invention present in a particular biological sample. These
methods are useful for the diagnosis or prognosis of diseases associated with altered levels of the
protein of the invention. Diagnostic assays to detect the protein of the invention may comprise a
biopsy, in situ assay of cells from organ or tissue sections, or an aspirate of cells from a tumor or
normal tissue. In addition, assays may be conducted upon cellular extracts from organs, tissues,
cells, urine, or serum or blood or any other body fluid or extract.

Assays for the quantification of the PBR/IBP of SEQ ID NO: 411 may be performed
according to methods well known in the art. Typically, these assays comprise contacting the sample
with a ligand of the protein of the invention or an antibody (polyclonal or monoclonal) which
recognizes the protein of the invention or a fragment thereof, and detecting the complex formed
between the protein of the invention present in the sample and the ligand or antibody. Fragments of
the ligands and antibodies may also be used in the binding assays, provided these fragments are
capable of specifically interacting with the PBR/IBP of the subject invention. Further, the ligands
and antibodies which bind to the PBR/IBP of the invention may be labeled according to methods
known in the art. Labels which are useful in the subject invention include, but are not limited to,
enzyme labels, radioisotopic labels, paramagnetic labels, and chemiluminescent labels. Typical
techniques are described by Kennedy, J. H., et al. (1976) Clin. Chim. Acta 70:1-31; and Schurs, A.
H. et al. (1977) Clin. Chim. Acta 81:1-40.

The subject invention also provides methods and compositions for the identification of
metastatic tumor masses. In this aspect of the invention, the polypeptides and antibodies which
bind the polypeptides of the invention may be used as a marker for the identification of the
metastatic tumor mass. Metastatic tumors which originated from the liver may overexpress the
PBR/IBP of SEQ ID NO: 411, whereas newly forming tumors, or those originating from other
tissues, are not expected to bear the PBR/IBP of SEQ ID NO: 411.

Protein of SEQ ID NO: 397 (internal designation 160-28-4-0-C4-CS)

The protein of SEQ ID NO: 397, encoded by the cDNA of SEQ ID NO: 156 (clone
160-28-4-0-C4-CS), exhibits homology to the ADP-ribosylation factor (ARF) family of proteins. The
ARF family includes ADP-ribosylation factors (ARFs) and ARF-like proteins (ARLs); the ARF
family of proteins is one family of the Ras superfamily. Proteins belonging to the Ras superfamily
have molecular weights of 18-30 kDa and function in a variety of cellular processes including, but
not limited to, signaling, growth, immunity, and protein transport.

ARFs are monomeric GTP-binding proteins, related structurally to both G protein alpha-subunits
and Ras proteins. ARF family members share more than 60% sequence identity, appear to
be ubiquitous in eukaryotes, and are evolutionarily highly conserved throughout. Immunologically,
they have been localized to the Golgi apparatus of several types of cells (Stearns et al. Proc. Natl.
Acad. Sci. (USA) 87: 1238-1242 (1990)). ARF proteins enhance the ADP-ribosyltransferase
activity of cholera toxin as an allosteric activator (Noda et al. Biochim. Biophys. Acta 1034:
195-199 (1990)). ARFs have also been shown to act as regulatory molecules, or "switches", for linking
two processes, e.g., the process of vesicle fission from a donor compartment and fusion with an
acceptor compartment (Rothman, J. E. and Wieland, F. T. Science 272: 227-234 (1996)). ARF
family members fall into three classes, classes I-III, according to their size and sequence homology.
Class I comprises ARF1, ARF2, and ARF3; Class II comprises ARF4 and ARF5; and Class III
comprises ARF6.

The classes occupy different subcellular locations and have been implicated in different
transport pathways. Class I ARFs localize to the Golgi where they are involved in the regulation of
ER-Golgi and intra-Golgi transport. Class I ARFs are also involved in the recruitment of cytosolic
coat proteins to Golgi membranes during the formation of transport vesicles. Class III (e.g., ARF6)
localizes to a tubulovesicular compartment, secretory granules, and the plasma membrane, where it
is involved in regulated secretion and recycling. Class II ARFs appear to be cytosolic, but their role
has not been elucidated (Radhakrishna, H. and Donaldson, J. G. J. Cell Biol. 139: 49-61 (1997)).

ARF function, in general, is regulated by a GDP-GTP cycle. For example, ARF1 is
cytosolic in the GDP-bound state, but is associated with membranes when in the GTP-bound state.
A guanine nucleotide exchange factor (GEF) in the donor compartment recruits ARF1 to the
membrane. At the membrane, GTP-ARF1 recruits coat proteins, which assemble together into
spherical coats, budding off vesicles in the process. After budding, hydrolysis of bound GTP causes
ARF1 to dissociate from the membrane. ARF1 dissociation causes the coat to become unstable and
dissociate as well (Rothman, supra).

Members of the ARF multigene family, when expressed as recombinant proteins in E. coli,
display different phospholipid and detergent requirements (Price, et al. J. Biol. Chem. 267:
17766-17772 (1992)). Some lipids and/or detergents, e.g., SDS, cardiolipin,
dimyristoylphosphatidylcholine (DMPC)/cholate, enhance ARF activities (Bobak, et al.
Biochemistry 29:855-861 (1990); Noda, et al. Biochim. Biophys. Acta 1034: 195-199 (1990); Tsai,
et al. J. Biol. Chem. 263:1768-1772 (1988)). ARFs also activate phospholipase D (PLD), a
membrane-bound enzyme implicated as an effector of several growth factors (Boman, A. L. and
Kahn, R. A. Trends Biochem. Sci. 20: 147-150 (1995)). PLD1 has been shown to be activated by a
variety of G-protein regulators, for example, PKC (protein kinase C) and ADP-ribosylation factor
(ARF). PKC and ARFs may regulate G-proteins either individually or together in a synergistic
manner. Recently the role of ARFs in microtubule formation has also been demonstrated.
ADP-ribosylation of tubulin almost completely blocked self-assembly of this protein in brain
(Terashima, M. et al. J. Nutr. Sci. Vitaminol. 45: 393-400 (1999)).

In general, differences in the various ARF sequences are concentrated in the amino-terminal
regions and the carboxyl portions of the proteins. Only three of 17 amino acids in the amino termini
have been shown to be identical among ARFs, and four amino acids in this region of ARFs 1-5 are
missing in ARF6 (Tsuchiya, et al. J. Biol. Chem. 266: 2772-2777 (1991)). It was reported (Kahn, et
al. J. Biol. Chem. 267: 13039-13046 (1992)) that the amino-terminal regions of ARF proteins form
an alpha-helix and that this domain is required for membrane targeting, interaction with lipid, and
ARF activity.

Schliefer et al. (J. Biol. Chem. 257: 20-23 (1991)) have described a protein distinctly larger
than ARF that possessed ARF-like activity. ARF-like proteins, or ARLs, have been found in
different species. Some ARLs appear to lack ADP-ribosyltransferase-enhancing activity; ARLs
may differ in GTP-binding requirements and GTPase activity as compared to various ARF
isoforms. For example, ARP, a mammalian ARL, is 33-39% identical to members of the ARF
family; ARP, however, differs from other ARF family proteins by virtue of its ability to hydrolyze
bound GTP in the absence of other proteins. ARP protein, unlike ARFs, is typically associated with
the plasma membrane instead of the cytosol (Schurmann, A. J. Biol. Chem. 270, 30657-30663 (1995)).

ARF family members have been implicated in several disease processes, such as Lowe's
syndrome, an X-linked disorder characterized by congenital cataracts, renal tubular dysfunction and
neurological deficits. These disorders may be due to an inability to recruit ARF to the Golgi
membrane (Suchy, S. F. et al. Hum. Mol. Genet. 4: 2245-2250 (1995); Londono, I. et al. Kidney Int.
55: 1407-1416 (1999)). It has also been suggested that regulation of ARF is involved in cystic
fibrosis, Dent's disease, diabetes, and autosomal dominant polycystic kidney disease (Marshansky,
V., et al. Electrophoresis 18: 2661-2676 (1997)).

The new human ARF-related protein of SEQ ID NO: 397, encoded by clone 160-28-4-0-C4-CS
in one embodiment, and the related polynucleotides, provide new compositions which are useful
in the diagnosis, treatment, and prevention of secretory, exocytosis, endocytosis and other
"sorting disorders."

The subject invention provides a polypeptide comprising the amino acid sequence of SEQ
ID NO: 397 or clone 160-28-4-0-C4-CS, or biologically active fragments thereof. The intact protein
of interest is 173 amino acids in length, has an ARF family amino acid motif (Pfam), and has an
ATP/GTP-binding site motif A (P-loop) (PS00017). The protein of SEQ ID NO: 397 or clone
160-28-4-0-C4-CS also has chemical and structural similarity with human ARL1 (P40616), ARD-1
(R66033) and ARF6 (GI 178989) (31%, 31% and 27% identity, respectively). The amino acid
length of SEQ ID NO: 397 is similar to those of the aforementioned ARFs. Biologically active
fragments of SEQ ID NO: 397 have one or more of the biological activities typically associated with
the full-length protein. In one embodiment, the protein is encoded by clone 160-28-4-0-C4-CS.

The invention also provides variants of the protein of SEQ ID NO: 397 or clone
160-28-4-0-C4-CS. The variants have at least about 80%, more preferably at least about 90%, and most
preferably at least about 95% amino acid sequence identity to the amino acid sequence of SEQ ID
NO: 397 or clone 160-28-4-0-C4-CS. Variants according to the subject invention have at least one
functional and/or structural characteristic of ARFs. The invention also provides biologically active
fragments of the variant proteins.
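
The identity thresholds recited above can be illustrated with a short Python sketch. It assumes
two sequences that have already been aligned (gaps written as "-") and counts identical residues
over the aligned columns; this is only one of several common conventions for percent identity, the
function names are hypothetical, the example sequences are placeholders rather than SEQ ID NO: 397,
and the word "about" in the thresholds is not modeled.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns, skipping columns where both are gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as a match only when both residues are identical non-gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def identity_tier(pct: float) -> str:
    """Map a percent identity onto the 80% / 90% / 95% tiers (assumed hard cutoffs)."""
    if pct >= 95.0:
        return "most preferred (>= 95%)"
    if pct >= 90.0:
        return "more preferred (>= 90%)"
    if pct >= 80.0:
        return "variant (>= 80%)"
    return "below 80%"


# Placeholder aligned sequences for illustration only.
pct = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
print(pct, identity_tier(pct))
```

Reported identity values depend on the alignment method and on the denominator chosen, so figures
from different tools are not always directly comparable.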

The invention includes those polynucleotides encoding the protein of SEQ ID NO: 397 or
clone 160-28-4-0-C4-CS, variants of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS, and biologically
active fragments of both the protein of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS and variants
thereof. As is apparent to those skilled in the art, a variety of different DNA sequences can encode
the amino acid sequence of the proteins, variants, and biologically active fragments of said proteins
and variants. It is well within the skill of a person trained in the art to create these alternative DNA
sequences encoding proteins having the same, or essentially the same, amino acid sequence. These
variant DNA sequences are also within the scope of the subject invention. As used herein,
reference to "essentially the same" sequence refers to sequences that have amino acid substitutions,
deletions, additions, or insertions that do not materially affect biological activity.

The subject invention provides methods of treating cytoskeletal, secretory, and inflammatory
disorders/conditions comprising the administration of therapeutically effective amounts of a
composition comprising the protein of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS. These
methods can also be practiced using variants of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS, or
biologically active fragments of either SEQ ID NO: 397 or clone 160-28-4-0-C4-CS, or variants of
SEQ ID NO: 397 or clone 160-28-4-0-C4-CS. Disorders/conditions which can be treated by the
subject invention include, but are not limited to, prostate cancer, brain and other tumors, Lowe's
syndrome, glomerulonephritis, chronic glomerulonephritis, tubulointerstitial nephritis, inherited
X-linked nephrogenic diabetes insipidus, autosomal dominant polycystic kidney disease (ADPKD),
herpes gestationis, dermatitis herpetiformis, lupus erythematosus, Crohn's disease, irritable bowel
syndrome and Addison's disease; secretory/endocytotic disorders such as cystic fibrosis,
glucose-galactose malabsorption syndrome, hypercholesterolemia, hyper- and hypoglycemia, Graves'
disease, goiter, and Cushing's disease; conditions associated with abnormal vesicle trafficking,
including acquired immunodeficiency syndrome (AIDS); allergies including hay fever, asthma, and
urticaria (hives); autoimmune hemolytic anemia; multiple sclerosis; myasthenia gravis; rheumatoid
and osteoarthritis; Chediak-Higashi and Sjogren's syndromes; toxic shock syndrome; traumatic
tissue damage; viral, bacterial, fungal, helminthic, and protozoal infections.

In another embodiment, a vector capable of expressing the protein of SEQ ID NO: 397 or
clone 160-28-4-0-C4-CS, or biologically active fragments thereof, can be administered to a subject
to treat or prevent disorders including, but not limited to, those described above. Alternatively, the
vector can encode a variant, or biologically active fragment of the variant protein. Multiple vectors
encoding any combination of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS, variants, and/or
biologically active fragments of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS and/or variants can
be administered to a subject.

In a further embodiment, a pharmaceutical composition comprising a substantially purified
protein of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS (and/or biologically active fragments
thereof), in conjunction with a suitable pharmaceutical carrier, can be administered to a subject to
treat or prevent the above mentioned disorders. Alternatively, a pharmaceutical composition
comprising a substantially purified variant protein of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS
(and/or biologically active fragments thereof), in conjunction with a suitable pharmaceutical carrier,
can be administered in the aforementioned therapeutic regimens. As would be apparent to the
skilled artisan, any therapeutically effective combination of the protein encoded by SEQ ID NO:
397 or clone 160-28-4-0-C4-CS (and/or biologically active fragments thereof) and variants of SEQ
ID NO: 397 or clone 160-28-4-0-C4-CS (and/or biologically active fragments thereof), in
conjunction with a suitable pharmaceutical carrier, can be used in the aforementioned therapeutic
regimens.

ARFs are known to be involved in regulated transport of vesicles. Therefore, in another
embodiment, the protein of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS, variants, and/or
biologically active fragments of said proteins and/or variants can be used as a component of drug
delivery vehicles such as colloids or liposomes. The protein of SEQ ID NO: 397 or clone
160-28-4-0-C4-CS, variants, and/or biologically active fragments of said proteins and/or variants can be
incorporated into the lipid membranes of liposomes and can serve as specific targeting agents. The
methods of design of such drug delivery systems are known by those skilled in the art and can be
practiced according to conventional pharmaceutical principles (Smith H. J. Introduction to the
principles of drug design and action, 3rd ed. (1998); Chien Y.W. Novel Drug Delivery Systems, 2nd
ed. (1992); Storm G. et al. J. Liposome Res. 4: 641-666 (1994); and Crommelin D.J.A. et al. Adv.
Drug Delivery Rev. 17: 49-60 (1995)).

In another embodiment of the invention, the polynucleotides encoding the protein of SEQ
ID NO: 397 or clone 160-28-4-0-C4-CS can be used for therapeutic purposes. Polynucleotides
encoding fragments of the protein of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS can also be used
in therapeutic regimens. In one aspect, the complement of the polynucleotide encoding the protein
of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS can be used in situations in which it would be
desirable to block the transcription of the mRNA. Modifications of gene expression can be obtained
by designing complementary sequences or antisense molecules (DNA, RNA, or PNA) to the
control, 5', or regulatory regions of the gene encoding the protein of interest. Such technology is
now well known in the art, and sense or antisense oligonucleotides or larger fragments can be
designed from various locations along the coding or control regions of sequences encoding the
protein of interest. Methods of treatment utilizing antisense technology are also well known to
those skilled in the art.

Another embodiment of the invention provides methods of assessing PLD modulation by 
using ARF properties of the protein of interest. 

In another embodiment, antibodies which specifically bind the protein of SEQ ID NO: 397
or clone 160-28-4-0-C4-CS can be used for the diagnosis of disorders characterized by expression
of the protein, or in assays to monitor patients being treated with the protein of interest. Methods of
making both polyclonal and monoclonal antibodies are well known in the art. Diagnostic assays
which can be used in this aspect of the invention include, but are not limited to, ELISAs, RIAs, and
FACS, and are well known in the art. These assays also provide a basis for diagnosing or
identifying altered or abnormal levels of expression of SEQ ID NO: 397 or the polypeptides encoded
by the human cDNA of clone 160-28-4-0-C4-CS as compared to normal individuals. These
screening methods are, likewise, well known to the skilled artisan.

In another embodiment of the invention, the protein of interest, its catalytic or immunogenic
fragments, or oligopeptides thereof can be used for screening libraries of compounds in any of a
variety of drug screening techniques. The fragment employed in such screening can be free in
solution, affixed to a solid support, recombinantly expressed on, or chemically attached to, a cell
surface, or located intracellularly. The formation of binding complexes between the protein of
interest and the agent being tested can be measured by methods well known to those skilled in the
art. Another technique for drug screening provides for high throughput screening of compounds
having suitable binding affinity to the protein of interest. (See, e.g., Geysen, et al. (1984) PCT
application WO84/03564.)

In another embodiment of the invention, the polynucleotides encoding the protein of interest
can be used for diagnostic purposes. The polynucleotides can be used to detect and quantify gene
expression in biopsied tissues in which expression of the protein of interest can be correlated with a
disease or condition. Such diagnostic assays are well known in the art and can be used to monitor
regulation of the protein of interest levels during therapeutic intervention and/or to determine
absence, presence, and excess expression of the protein of interest. Examples of such conditions
and disorders have been provided supra. The polynucleotide sequences encoding the protein of
interest can be used, for example, in Southern or Northern analyses, dot blot, or other
membrane-based technologies; in PCR technologies; in dipstick, pin, and ELISA assays; and in microarrays
utilizing fluids or tissues from patients to detect altered expression of the protein of SEQ ID
NO: 397 or clone 160-28-4-0-C4-CS. Such qualitative or quantitative methods are well known in the
art.

In further embodiments, oligonucleotides or longer fragments derived from any of the
polynucleotide sequences described herein can be used as targets in a microarray. The microarray
can be used to monitor the expression level of large numbers of genes simultaneously and to
identify genetic variants, mutations, and polymorphisms. This information can be used to determine
gene function, to understand the genetic basis of a disorder, to diagnose a disorder, and to develop
and monitor the activities of therapeutic agents. Microarrays can be prepared, used, and analyzed
using methods known in the art. (See, e.g., Brennan, T. M. et al. (1995) U.S. Pat. No. 5,474,796;
Heller, R. A. et al. (1997) Proc. Natl. Acad. Sci. 94: 2150-2155; and Heller, M. J. et al. (1997) U.S.
Pat. No. 5,605,662.)

Another embodiment of the subject invention provides nucleic acid sequences encoding the
protein of interest which can be extended utilizing a partial nucleotide sequence and various
PCR-based methods. This aspect of the invention provides methods for the detection of upstream
sequences, such as promoters and regulatory elements. Methods of practicing this aspect of the
invention are also well known in the art.

In other embodiments of the disclosed therapeutic regimens, any of the proteins, variants,
biologically active fragments, antibodies, complementary sequences, or vectors of the invention can
be administered in combination with other appropriate therapeutic agents. Selection of the
appropriate agents for use in combination therapy can be made by one of ordinary skill in the art.
The combination of therapeutic agents can act synergistically to effect the treatment or prevention
of the various disorders described above. In particular, purified protein can be used to produce
antibodies or to screen libraries of pharmaceutical agents to identify those which specifically bind
the protein of interest. Neutralizing antibodies are especially preferred for therapeutic use.

Protein of SEQ ID NO: 287 (internal designation 174-5-3-0-H7-CS)

The protein of SEQ ID NO: 287, encoded by human cDNA of SEQ ID NO: 46 (clone
174-5-3-0-H7-CS), is highly homologous (more than 99% identity in amino acids) to the human protein
encoded by the CLN8 gene listed in Genbank under accession number AF123757. The two
proteins differ by two conservative amino-acid substitutions (alanine for valine at position 155 and
serine for asparagine at position 225). In addition, the protein encoded by 174-5-3-0-H7-CS
contains seven transmembrane domains. These domains are located at amino acids 25-45, 71-91,
100-120, 133-153, 160-180, 205-225, and 228-248 as predicted by the software TopPred II (Claros
and von Heijne, CABIOS applic. Notes, 10:685-686 (1994)). The protein encoded by SEQ ID
NO: 287 also exhibits a signal peptide at positions 1-50 and a retention signal KKRP from positions
283 to 286.

CLN8 was identified recently by positional cloning (Ranta et al., Nat Genet. 1999
Oct;23(2):233-6). CLN8 encodes a 286 amino-acid putative transmembrane protein with no
homology to previously known proteins. A naturally-occurring missense mutation in codon 24
(R24G at the border of the first putative transmembrane domain) is the molecular basis for EPMR
("progressive epilepsy with mental retardation", MIM 600143). EPMR, also called Northern
Epilepsy, is an autosomal recessive disorder characterized by normal early development, onset of
generalized tonic-clonic seizures between the ages of 5 and 10 years, and subsequent progressive
mental retardation. Neuropathological findings have shown that EPMR is a new member of the
neuronal ceroid lipofuscinosis (NCL) group of neurodegenerative disorders. The NCLs are a
genetically heterogeneous group of progressive neurodegenerative disorders characterized by the
accumulation of autofluorescent lipopigment in various tissues. CLN8 is the eighth gene to be

Subsequently, the homologous mouse gene (Cln8) was sequenced (82% nucleotide identity 
with the human gene) and localized to the region of the mouse genome linked to motor neuron 
degeneration, mouse mnd. Mnd is a naturally-occurring mouse mutant with intracellular 
autofluorescent inclusions similar to those seen in EPMR. A mutation in mnd mouse DNA was
identified, indicating that mnd is a murine ortholog for CLN8 (Ranta et al., Nat Genet. 1999 
Oct;23(2):233-6), and that mice containing mutations in Cln8 represent a murine model for NCL 
disorders. 

Recent experimental evidence has confirmed the transmembrane nature of the CLN8
protein (Lonka L et al., Hum Mol Genet. 2000 Jul 1;9(11):1691-7). CLN8 resides in the
endoplasmic reticulum (ER) and recycles between the ER and the ER-Golgi intermediate
compartment (ERGIC) via a KKXX ER-retrieval motif at its C-terminus (KKRP, amino acids
283-286). This motif is recognized and bound by COPI, a vesicle-coating protein found in retrograde
vesicles delivering cargo from the cis Golgi to the ER. The 30 kD CLN8 protein is not processed
during its maturation (in particular it is not N-glycosylated). The EPMR-associated R24G mutation
does not alter cellular localization in humans.
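
The C-terminal KKXX pattern described above (lysines at the fourth and third positions from the C-terminus, any residue at the last two) can be checked with a short script. The following is a minimal sketch only: the function names are hypothetical, and the 286-residue test string is a poly-alanine placeholder that ends in KKRP, not the sequence of SEQ ID NO: 287.

```python
import re
from typing import Tuple

# KKXX: lysine at positions -4 and -3 from the C-terminus, any residue at -2 and -1.
KKXX_PATTERN = re.compile(r"KK[A-Z]{2}$")


def has_kkxx_motif(protein_sequence: str) -> bool:
    """Return True if the sequence ends with a KKXX-type motif."""
    return KKXX_PATTERN.search(protein_sequence.upper()) is not None


def c_terminal_motif_positions(protein_sequence: str) -> Tuple[int, int]:
    """Return the 1-based (start, end) positions of the last four residues."""
    length = len(protein_sequence)
    if length < 4:
        raise ValueError("sequence is shorter than the four-residue motif")
    return (length - 3, length)


# Hypothetical placeholder: 282 alanines followed by KKRP (286 residues in total).
placeholder = "A" * 282 + "KKRP"
print(has_kkxx_motif(placeholder))
print(c_terminal_motif_positions(placeholder))
```

For a 286-residue sequence the last four residues fall at positions 283 to 286, which matches the location given for the KKRP signal.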

The subject invention provides a polypeptide encoded by SEQ ID NO: 287 and biologically
active fragments of said polypeptide. Compositions comprising polypeptides and pharmaceutically
acceptable carriers are likewise provided. Preferred polypeptides, and biologically active fragments
thereof, have any of the biological activities or domains/motifs described herein and/or contain the
amino acids of positions 155 and 225, 283 to 286. In one embodiment, the protein/polypeptide of
SEQ ID NO: 287 is encoded by clone 174-5-3-0-H7-CS.

The ER/ERGIC cellular localization of the protein of this invention can be used to target
compounds to the ER/ERGIC. This targeting can be observed using any of the techniques known to
those skilled in the art, including those described in Lonka L et al., Hum Mol Genet. 2000 Jul
1;9(11):1691-7. In this aspect of the invention, the protein of SEQ ID NO: 287, or biologically
active fragments thereof, can be used to target liposomes, vesicles, or colloids to the ER/ERGIC
compartment where active agents can be delivered. Methods of making and using targeted
liposomes are well known in the art.

In another embodiment, liposomes comprising the protein of SEQ ID NO: 287 can contain a
second targeting agent for the specific selection of a target cell. The second targeting agent can be
selected for its ability to specifically target a cell or tissue. Thus, the second targeting agent can be
specific for tumor markers, such as HER2. Alternatively, markers associated with specific cell
types can be used (e.g., CD34, CD4, CD8, etc.). In a preferred embodiment, the second targeting
agent is an antibody. Active agents include, but are not limited to, chemotherapeutic agents, protein
cross-linking agents, inhibitors of protein synthesis, anti-bacterial agents (e.g., antibiotics), antiviral
agents, and/or anti-parasitic agents. The ability to bind the COPI coatomer can be assayed as
described in Cosson P, Letourneur F, Science. 1994 Mar 18;263(5153):1629-31.

In another embodiment, the present invention provides methods of, and compositions for,
identifying specific cellular compartments, such as the ER, ERGIC, and retrograde transport
vesicles. This embodiment provides antibodies which specifically bind the protein of SEQ ID
NO: 287, or biologically active fragments thereof, which are labeled with detectable markers, such
as gold particles, enzymes, radioisotopes, or paramagnetic labels. ER, ERGIC, and retrograde
transport vesicles can be identified in samples according to well-known immuno-diagnostic
protocols. The antibodies, either monoclonal or polyclonal, can be made according to well-known
methods. In a preferred embodiment, the antibodies bind to the ER retention signal.

In another embodiment, the protein of the invention or part thereof can be used as a reagent
for differential identification of the tissue(s) or cell type(s) present in a biological sample and for
diagnosis of diseases and conditions, which include, but are not limited to, asthma, pulmonary
edema, atherosclerosis, restenosis, stroke potential, thrombosis and hypertension. Similarly, the
protein of the invention, or biologically active fragments thereof, and antibodies thereto can provide
immunological probes for differential identification of the tissue(s) or cell type(s). In a number of
disorders listed above, particularly of the pulmonary and cardiovascular systems, expression of this
protein at significantly higher or lower levels can be routinely detected in certain tissues or cell
types (e.g., vascular tissues, cancerous and wounded tissues) or bodily fluids (e.g., lymph, serum,
plasma, urine, synovial fluid and spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e., the expression
level in healthy tissue or bodily fluid from an individual not having the disorder.

Indeed, the first 80 amino acids of the protein of the invention are identical to two
polypeptides claimed in Patent WO 99/35158, hereby incorporated by reference in its entirety (SEQ
ID NO: 98 and SEQ ID NO: 162, corresponding to Geneseq accession numbers Y38413/Y38428 and
Y38492), which are over-expressed in pulmonary and endothelial tissues.

The tissue distribution in pulmonary and endothelial tissues indicates that the protein
product described in WO 99/35158 is useful for the treatment and diagnosis of cardiovascular and
respiratory or pulmonary disorders such as asthma, pulmonary edema, pneumonia, atherosclerosis,
restenosis, stroke, angina, thrombosis, hypertension, inflammation, and wound healing. Those
conditions can be diagnosed by determining the amount of the protein of the invention in a sample.
Thus, antibodies raised against the protein of SEQ ID NO: 287, or an immunogenic fragment of the
protein, can be used in diagnostic, prognostic, or screening assays such as those taught in WO
99/35158.



Protein of SEQ ID NO: 270 (internal designation 116-119-3-0-H5-CS)

The protein of SEQ ID NO: 270, encoded by the extended cDNA of SEQ ID NO: 29, is
homologous to the human mitochondrial ATP synthase f subunit or ATPK (E.C. 3.6.1.34) (Swissprot
accession number P56134) and is overexpressed in fetal kidney.

The protein of SEQ ID NO: 270, composed of 88 amino acid residues, contains one
transmembrane segment (positions 1 to 55) predicted by the software TopPred II (Claros and von
Heijne, CABIOS applic. Notes, 10:685-686 (1994)). BLAST results show that 100% homology is found
between amino acids 5 to 88 of the protein of the invention and amino acids 10 to 93 of the human ATP
synthase f chain (93 amino acids total), with exon 1 of the cDNA of SEQ ID NO: 29 accounting for the
difference between the two proteins (the last 3 exons show 100% homology). Thus, the protein of the
invention represents a new isoform of the human mitochondrial ATP synthase f subunit. It is interesting
to note that the same splice variant is found in bovine, pig and mouse species.
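
The residue correspondence stated above (positions 5 to 88 of the protein of the invention against positions 10 to 93 of the ATP synthase f chain) amounts to a constant offset of five residues over an 84-residue block. The sketch below illustrates that arithmetic only; the function and constant names are hypothetical, and it assumes an ungapped match, as the 100% figure in the text suggests.

```python
# Coordinates taken from the text: query 5-88 matches reference 10-93.
QUERY_START, QUERY_END = 5, 88
REFERENCE_START, REFERENCE_END = 10, 93


def matched_block_length(start: int, end: int) -> int:
    """Number of residues in an inclusive 1-based range."""
    return end - start + 1


def map_query_to_reference(query_position: int) -> int:
    """Map a 1-based query position to the reference, assuming an ungapped match."""
    if not QUERY_START <= query_position <= QUERY_END:
        raise ValueError("position lies outside the matched block")
    return query_position + (REFERENCE_START - QUERY_START)


# Both ranges must cover the same number of residues for an ungapped match.
assert matched_block_length(QUERY_START, QUERY_END) == matched_block_length(
    REFERENCE_START, REFERENCE_END
)
print(matched_block_length(QUERY_START, QUERY_END))
print(map_query_to_reference(5), map_query_to_reference(88))
```

Positions 1 to 4 of the 88-residue protein and positions 1 to 9 of the 93-residue reference fall outside the matched block, which is where the two proteins differ.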

The mitochondrial electron transport (or respiratory) chain is a series of enzyme complexes in
the mitochondrial membrane that is responsible for the transport of electrons from NADH to oxygen
and the coupling of this oxidation to the synthesis of ATP (oxidative phosphorylation). ATP then
provides the primary source of energy for driving a cell's many energy-requiring reactions. ATP synthase
(F0F1 ATPase) is the enzyme complex at the terminus of this chain and serves as a reversible coupling
device that interconverts the energies of an electrochemical proton gradient across the mitochondrial
membrane into either the synthesis or hydrolysis of ATP. This gradient is produced by other enzymes
of the respiratory chain in the course of electron transport from NADH to oxygen. When the cell's
energy demands are high, electron transport from NADH to oxygen generates an electrochemical
gradient across the mitochondrial membrane. Proton translocation from the outer to the inner side of the
membrane drives the synthesis of ATP. Under conditions of low energy requirements and when there is
an excess of ATP present, this electrochemical gradient is reversed and ATP synthase hydrolyzes ATP.
The energy of hydrolysis is used to pump protons out of the mitochondrial matrix. ATP synthase is,
therefore, a dual complex, the F0 portion of which is a transmembrane proton carrier or pump, and the
F1 portion of which is catalytic and synthesizes or hydrolyzes ATP. The mammalian ATP synthase
complex consists of sixteen different polypeptides (Walker, J. E. and Collinson, I. R. (1994) FEBS
Lett. 346: 39-43). Six of these polypeptides (subunits alpha, beta, gamma, delta, epsilon, and an ATPase
inhibitor protein IF1) comprise the globular catalytic F1 ATPase portion of the complex, which lies
outside of the mitochondrial membrane. The remaining ten polypeptides (subunits a, b, c, d, e, f, g, F6,
OSCP, and A6L) comprise the proton-translocating, membrane spanning F0 portion of the complex.
Like other members of the respiratory chain, all but two of the polypeptide subunits of ATP synthase
are nuclear gene products that are imported into the mitochondria. Enzyme complexes similar to
mammalian ATP synthase are found in all cell types and in chloroplast and bacterial membranes. This
universality indicates the central importance of this enzyme to ATP metabolism. Transcriptional
regulation of these nuclear encoded genes appears to be the predominant means for controlling the
biogenesis of ATP synthase. Multiple mitochondrial pathologies exist because of the essential role of
mitochondrial oxidative phosphorylation in cellular energy production, in the generation of reactive
oxygen species and in the initiation of apoptosis (Wallace, Science, 283: 1482-1488, 1999). It is now
clear that mitochondrial diseases encompass an assemblage of clinical problems commonly involving
tissues that have high energy requirements such as heart, muscle and the renal and endocrine systems.

Over the past 11 years, a considerable body of evidence has accumulated implicating defects in the
mitochondrial energy-generating pathway, oxidative phosphorylation, in a wide variety of degenerative
diseases including myopathy and cardiomyopathy. Most classes of pathogenic mitochondrial DNA
mutations affect the heart, in association with a variety of other clinical manifestations that can include
skeletal muscle, the central nervous system (including eye), the endocrine system, and the renal system.
Nuclear mutations causing mitochondrial disorders have been described. They are often found in highly
conserved subunits. Mitochondrial disorders with nuclear mutations include: myopathies (PEO,
MNGIE, congenital muscular dystrophy, carnitine disorders), encephalopathies (Leigh, Infantile,
Wilson's disease, Deafness-Dystonia syndrome), other systemic disorders and cardiomyopathies.

The discovery of a new ATP synthase subunit, and of polynucleotides encoding it, satisfies a need
in the art by providing new compositions which are useful for the diagnosis, prevention, and treatment
of cancer, myopathies, immune disorders, and neurological disorders.

It is believed that the protein of SEQ ID NO: 270 or part thereof plays a role in cellular
respiration, preferably as a mitochondrial ATP synthase subunit. Preferred polypeptides of the
invention are fragments of SEQ ID NO: 270 having any of the biological activities described herein.

An object of the present invention is to provide compositions and methods for targeting heterologous
compounds, either polypeptides or polynucleotides, to mitochondria by recombinantly or chemically
fusing a fragment of the protein of the invention to a heterologous polypeptide or polynucleotide.
Preferred fragments are signal peptides, amphiphilic alpha helices and/or any other fragments of the
protein of the invention, or part thereof, that may contain targeting signals for mitochondria
including but not limited to matrix targeting signals as defined in Herrmann and Neupert, Curr.
Opinion Microbiol. 3:210-4 (2000); Bhagwat et al. J. Biol. Chem. 274:24014-22 (1999); Murphy
Trends Biotechnol. 15:326-30 (1997); Glaser et al. Plant Mol Biol 38:311-38 (1998); Ciminale et al.
Oncogene 18:4505-14 (1999). Such heterologous compounds may be used to modulate
mitochondrial activities. For example, they may be used to induce and/or prevent
mitochondrial-induced apoptosis or necrosis. In addition, heterologous polynucleotides may be used for
mitochondrial gene therapy to replace a defective mitochondrial gene and/or to inhibit the
deleterious expression of a mitochondrial gene.

The invention further relates to methods and compositions using the protein of the invention
or part thereof to diagnose, prevent and/or treat several disorders in which the mitochondrial respiratory
electron transport chain is impaired, including but not limited to mitochondriocytopathies, necrosis,
aging, myopathies, cancer and neurodegenerative diseases such as Alzheimer's disease,
Huntington's disease, Parkinson's disease, epilepsy, Down's syndrome, dementia, multiple sclerosis,
and amyotrophic lateral sclerosis. For diagnostic purposes, the expression of the protein of the
invention could be investigated using any of the Northern blotting, RT-PCR or immunoblotting
methods described herein and compared to the expression in control individuals. For prevention
and/or treatment purposes, the protein of the invention may be used to enhance electron transport
and increase energy delivery using any of the gene therapy methods described herein or known to
those skilled in the art.

In another embodiment, the invention further relates to methods and compositions using the
protein of the invention or part thereof to diagnose, prevent and/or treat several disorders in which
the mitochondrial respiratory electron transport chain needs to be impaired, including but not limited to
Sjogren's syndrome, Addison's disease, bronchitis, dermatomyositis, polymyositis, glomerulonephritis,
diabetes mellitus, emphysema, Graves' disease, atrophic gastritis, lupus erythematosus, myasthenia
gravis, multiple sclerosis, autoimmune thyroiditis, ulcerative colitis, anemia, pancreatitis, scleroderma,
rheumatoid arthritis and osteoarthritis, asthma, allergic rhinitis, atopic dermatitis, and gout, using any
techniques known to those skilled in the art including the antisense or triple helix strategies described
herein.

Moreover, antibodies to the protein of the invention or part thereof may be used for
detection of mitochondria and/or mitochondrial membranes using any techniques known
to those skilled in the art.

Protein of SEQ ID NO: 271 (internal designation 117-001-5-0-G3-CS)

The protein of SEQ ID NO: 271 is homologous to the family of lipopolysaccharide (LPS)
binding proteins (LBPs). Several families of proteins have the ability to bind LPS including (a) the
lipopolysaccharide-binding proteins (LBPs), and (b) the bactericidal permeability-increasing
proteins (BPIs). Cholesteryl ester transfer protein (CETP), which is involved in the transfer of
insoluble cholesteryl esters in reverse cholesterol transport, shares some homology to members of
the LPS binding family of proteins.

Lipopolysaccharide (LPS), alternatively known as bacterial endotoxin, is a major component of 
the outer membrane of Gram-negative bacteria. It consists of serotype-specific O-side chain 
polysaccharides linked to a core oligosaccharide and Lipid A. LPS is a potent mediator of the 
inflammatory response and stimulates the expression of many pro-inflammatory and pro-coagulant
compounds in monocytes, macrophages, and endothelial cells. While these responses are important in
containing and eliminating localized infections, systemic exposure to LPS can lead to a number of
adverse effects. These include: (a) induction of an inflammatory cascade, (b) damage to the
endothelium, (c) widespread coagulopathies, and (d) organ damage. 

Systemic exposure to LPS can arise from direct infection by Gram-negative bacteria, leading to
the complications of Gram-negative sepsis. Examples of diseases which are associated with
Gram-negative bacterial infection or endotoxemia include bacterial meningitis, neonatal sepsis, cystic
fibrosis, inflammatory bowel disease, liver cirrhosis, Gram-negative pneumonia, Gram-negative
abdominal abscess, hemorrhagic shock, and disseminated intravascular coagulation. Subjects who are
leukopenic or neutropenic, including subjects treated with chemotherapy or immunocompromised
subjects, are particularly susceptible to bacterial infection and the subsequent effects of endotoxin
exposure.

Gram-negative sepsis remains one of the primary causes of severe systemic inflammation in
hospitalized and immunocompromised patients. Alternatively, changes in gut permeability caused by a
variety of circumstances, including trauma, can lead to translocation of bacteria/LPS into the bloodstream.
Bacteria translocated from the gut are thought to play a major role in post-surgical immunosuppression
(Little et al., Surgery 114:87-91 (1993)) and hemorrhagic shock. Therefore, there is great interest in
characterizing proteins involved in the biological response to LPS and in discovering therapies that can
counteract the effects of LPS in pathological situations.

LBP is a 60 kDa glycoprotein synthesized in the liver and present in normal human serum.
LBP expression is upregulated in response to infectious, inflammatory, and toxic mediators. LBP
expression has been induced in animals challenged with LPS, silver nitrate, turpentine, and
Corynebacterium parvum (Geller et al., Surgery 128:22-28 (1993); Gallay et al., Infect. Immun.
61:378-383 (1993); Tobias et al., J. Exp. Med. 164:777-793 (1986)). LBP levels are correlated with
exposure to LPS, and elevated levels (particularly persistent elevated levels) have been correlated with
poor clinical outcomes in septic patients (U.S. Patent Nos. 5,484,705, and 5,804,367, hereby
incorporated by reference in their entirety).

A portion of the LBP molecule (the N-terminal 1-197 aa) binds to the lipid A portion of the
LPS molecule to form a high affinity LBP/LPS complex (Tobias, et al., J. Biol. Chem
264:10867-10871 (1989)). The LBP/LPS complex potentiates the cellular response to LPS via an
interaction with the monocytic differentiation antigen CD14 (Wright et al., Science. 249:1431-1433 (1990);
Lee et al., J. Exp. Med. 175:1697-1705 (1992)). LPS can be transferred from LBP to
membrane-bound or soluble CD14. Activated CD14 can then interact with endothelial cells to elicit an
inflammatory response. The C-terminal portion of LBP is required to transfer LPS to CD14 (U.S.
Pat. No. 5,731,415; Theofan et al., J. Immunol. 152:3624-29 (1994); Han et al., J. Biol. Chem.
269:8172-75 (1994)). Evidence also suggests that LBP can neutralize LPS by an interaction with
serum lipoproteins or through the internalization of an LBP/LPS/CD14 complex by neutrophils
(Wurfel et al., J. Exp. Med. 180:1025-1035 (1994); Wurfel et al., J. Exp. Med. 181:1743-54 (1995);
Gegner et al., J. Biol. Chem. 270:5320-5325 (1995)).

The subject invention provides the polypeptide of SEQ ID NO: 271 and polynucleotide
sequences encoding the amino acid sequence of SEQ ID NO: 271. In one embodiment, the
polypeptides of SEQ ID NO: 271 are interchangeable with the polypeptides encoded by the human
cDNA of clone 181-20-3-0-B5-CS. Also included in the invention are biologically active fragments
of the protein of SEQ ID NO: 271 and polynucleotide sequences encoding these biologically active
fragments. In a preferred embodiment, biologically active fragments of SEQ ID NO: 271 are
encoded by clone 181-20-3-0-B5-CS and comprise the first 181 amino acids encoded by clone
181-20-3-0-B5-CS. "Biologically active fragments" are defined as those peptide or polypeptide
fragments of SEQ ID NO: 271 which have at least one of the biological functions of the full length
protein (e.g., the ability to bind bacterial LPS).

The invention also provides variants of SEQ ID NO: 271. These variants have at least
about 80%, more preferably at least about 90%, and most preferably at least about 95% amino acid
sequence identity to the amino acid sequence of SEQ ID NO: 271. Variants according to the
subject invention also have at least one functional or structural characteristic of SEQ ID NO: 271,
such as the biological functions described above. The invention also provides biologically active
fragments of the variant proteins. Unless otherwise indicated, the methods disclosed herein can be
practiced utilizing the polypeptide of SEQ ID NO: 271 or variants thereof. Likewise, the methods
of the subject invention can be practiced using biologically active fragments of the protein of SEQ ID NO:
or variants of said biologically active fragments.
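
The 80%, 90%, and 95% identity thresholds above can be illustrated with a small calculation over two sequences that have already been aligned. This is a sketch under stated assumptions: the text does not prescribe how identity is computed, so identity is taken here as identical residues divided by aligned columns (gap characters written as "-"), and the function names and example peptides are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns; '-' marks a gap (assumed convention)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def identity_tier(identity: float) -> str:
    """Classify an identity value against the thresholds named in the text."""
    if identity >= 95.0:
        return "most preferred"
    if identity >= 90.0:
        return "more preferred"
    if identity >= 80.0:
        return "variant"
    return "below threshold"


# Hypothetical ten-residue peptides differing at one position.
identity = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
print(identity, identity_tier(identity))
```

Published alignment tools differ in how they treat gaps and terminal overhangs, so the denominator used here is a design choice of the sketch rather than a definition drawn from the text.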

Because of the redundancy of the genetic code, a variety of different DNA sequences can
encode SEQ ID NO: 271. It is well within the skill of a person trained in the art to create these
alternative DNA sequences which encode proteins having the same, or essentially the same, amino
acid sequence. These variant DNA sequences are, thus, within the scope of the subject invention.
As used herein, reference to "essentially the same sequence" refers to sequences that have amino
acid substitutions, deletions, additions, or insertions that do not materially affect biological activity.
Fragments retaining one or more characteristic biological activities of SEQ ID NO: are also included
in this definition.

"Recombinant nucleotide variants" are alternate polynucleotides which encode a particular 
protein. They can be synthesized, for example, by making use of the "redundancy" in the genetic 
code. Various codon substitutions, such as the silent changes which produce specific restriction
sites or codon usage-specific mutations, can be introduced to optimize cloning into a plasmid or 
viral vector or expression in a particular prokaryotic or eukaryotic host system, respectively. 
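
The redundancy described above can be made concrete by enumerating every DNA sequence that encodes a short peptide. The sketch below uses the standard genetic code for four amino acids only; the table is deliberately partial, the function names are hypothetical, and the peptide KKRP is used purely as a short example.

```python
from itertools import product
from typing import Dict, Iterator, List

# Partial standard genetic code (DNA codons), limited to the residues used below.
SYNONYMOUS_CODONS: Dict[str, List[str]] = {
    "M": ["ATG"],
    "K": ["AAA", "AAG"],
    "R": ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"],
    "P": ["CCT", "CCC", "CCA", "CCG"],
}
CODON_TO_AMINO_ACID = {
    codon: amino_acid
    for amino_acid, codons in SYNONYMOUS_CODONS.items()
    for codon in codons
}


def enumerate_coding_sequences(peptide: str) -> Iterator[str]:
    """Yield every DNA sequence that encodes the peptide under the partial table."""
    codon_choices = [SYNONYMOUS_CODONS[amino_acid] for amino_acid in peptide]
    for codons in product(*codon_choices):
        yield "".join(codons)


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence using the partial table."""
    return "".join(CODON_TO_AMINO_ACID[dna[i:i + 3]] for i in range(0, len(dna), 3))


variants = list(enumerate_coding_sequences("KKRP"))
print(len(variants))  # 2 * 2 * 6 * 4 combinations
print(all(translate(variant) == "KKRP" for variant in variants))
```

Selecting among such synonymous sequences, for example to introduce a restriction site or to match the codon usage of a host, leaves the encoded amino acid sequence unchanged.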

The protein of SEQ ID NO: 271, and variants thereof, can be used to produce antibodies 
according to methods well known in the art. The antibodies can be monoclonal or polyclonal. 
Antibodies can also be synthesized against fragments of SEQ ID NO: 271, as well as variants
thereof, according to known methods. The subject invention also provides antibodies which
specifically bind to biologically active fragments of SEQ ID NO: 271 or biologically active
fragments of SEQ ID NO: 271 variants.

The subject invention also provides for immunoassays which are used to screen for, 
monitor, or diagnose exposure to LPS. In one embodiment, diagnostic assays measure the level of 
LBP in patient plasma samples. LBP levels are known to rise in response to exposure to LPS, thus
the measurement of the level of the protein of SEQ ID NO: 271 can provide an early indication of 
Gram-negative infection or of endotoxin exposure. 

The subject invention provides methods of treating individuals infected with Gram-negative
bacteria comprising the administration of therapeutically-effective compositions comprising SEQ
ID NO: 271. In one embodiment, the protein lacks the C-terminal portion (or portions of the
C-terminal domain) necessary to transfer LPS to CD14. LPS can be scavenged by the excess
N-terminal fragment and would be unable to induce an inflammatory response (see, e.g., U.S. Patent
No. 5,731,415, hereby incorporated by reference in its entirety).

Another aspect of the subject invention provides methods of prophylaxis. The method
treats individuals by administration of therapeutically-effective amounts of compositions
comprising SEQ ID NO: 271. Instances where this aspect of the invention can be performed
include, but are not limited to, conditions associated with increased translocation of gut bacteria and
endotoxin, particularly prior to surgery. In addition, patients who are at risk for potential
Gram-negative infection, including but not limited to patients undergoing chemotherapy, or patients who are
immunocompromised (for example with AIDS) can benefit from such treatment. Such uses are
described in U.S. Patent No. 5,990,082, hereby incorporated by reference in its entirety.

The N-terminal portion of LBP, which lacks the ability to induce an inflammatory response,
can be fused to other proteins or fragments thereof (such as the bactericidal/permeability-increasing
protein or BPI) which can increase the association of these molecules with LPS and aid in the
clearance of endotoxin from patients who have been exposed to Gram-negative bacteria. Such
preparations can be used to treat and inhibit a number of Gram-negative, Gram-positive,
or fungal infections, as described in the following patents: WO 95/19179 A, WO 95/19180 A, WO
95/19372 A, and WO 96/34873 A, each of which is incorporated by reference in its entirety.

The subject invention also provides methods of removing endotoxin from
recombinantly-produced proteins. In one embodiment, the recombinantly-produced proteins are obtained from
Gram-negative bacteria. In a preferred embodiment, the bacteria are E. coli. In another
embodiment, the protein of SEQ ID NO: 271, biologically active fragments thereof, variants, or
derivatives thereof, are contacted with compositions comprising recombinantly-produced proteins.
The contacting step can take place with SEQ ID NO: 271 immobilized on a substrate or with SEQ
ID NO: 271 present in free solution.

In addition, the protein of SEQ ID NO: 271, biologically active fragments, or derivatives
thereof, can be used in diagnostic assays to measure the level of LPS in patient plasma samples. In
such an assay, serum samples would be bound to a solid matrix, such as a membrane, plastic,
treated plastic, or other supports, and then probed with the protein of SEQ ID NO: 271.
Visualization can be achieved by fusing the protein of SEQ ID NO: to any number of enzymes
followed by treatment with a chromogenic, fluorogenic, or luminescent substrate. Alternatively, the
protein of SEQ ID NO: 271, biologically active fragments, variants, or derivatives thereof, can be
linked to a fluorescent or luminescent protein or compound. The linkage can be chemical or made
by recombinant techniques known to those skilled in the art. In addition, antibodies raised against
the protein of SEQ ID NO: 271, biologically active fragments, variants, or derivatives thereof can
be used to visualize the LPS/protein 271 complexes using immunoassays known to those skilled in
the art.

Protein of SEQ ID NO:266 (internal designation 116-110-2-0-F4-CS)

The protein of SEQ ID NO:266, highly expressed in the testis, is encoded by cDNA of SEQ
ID NO:25 and exhibits homology to the Ly-6 family of GPI-linked cell-surface glycoproteins
composed of one or more copies of a conserved domain of about 100 amino-acid residues
(PS00983; LY6_UPAR).

The protein of SEQ ID NO:266 shows significant structural similarities to mouse Ly-6
antigens, human CD59 and a herpes virus CD59 homolog. The protein of SEQ ID NO:266
displays one copy of the motif of the u-PAR/Ly-6 domain, with all ten extracellular cysteine
residues conserved. The mature protein sequence contains a relatively high proportion of cysteine
residues (10/105), which suggests that numerous disulfide bonds stabilize its tertiary structure.
Furthermore, the 124 amino-acid long protein of SEQ ID NO:266 has a size very similar to that of
many members of the Ly-6 family. In addition, the protein of the invention has a predicted signal
peptide structure (positions from 1 to 19) and a C-terminal hydrophobic fragment (positions from
101 to 121) necessary for GPI-anchoring in a membrane. Thus, the protein of the invention has a
clear evolutionary relationship with the Ly-6/uPAR family, particularly with the Ly-6 subfamily.
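
The cysteine proportion quoted above (10/105) follows from the lengths given in the same paragraph, if the mature protein is taken to be the 124-residue precursor minus the 19-residue signal peptide. The sketch below reproduces that arithmetic; the function names are hypothetical, and any further processing at the C-terminus is ignored as a simplifying assumption.

```python
def mature_length(precursor_length: int, signal_peptide_end: int) -> int:
    """Length after removing residues 1..signal_peptide_end (assumed sole processing)."""
    return precursor_length - signal_peptide_end


def cysteine_percentage(cysteine_count: int, sequence_length: int) -> float:
    """Cysteine content as a percentage of all residues."""
    return 100.0 * cysteine_count / sequence_length


# Values taken from the text: 124 residues, signal peptide 1-19, ten cysteines.
length_without_signal = mature_length(124, 19)
print(length_without_signal)
print(round(cysteine_percentage(10, length_without_signal), 1))
```

Ten cysteines in 105 residues is roughly one residue in ten, which is the basis for the inference about disulfide bonding made in the text.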

The Ly-6/uPAR protein family members share one or several repeat units of the Ly-6/uPAR 
domain, which is defined by a distinct disulfide bonding pattern between 8 or 10 cysteine residues. 
This family can be divided into two subfamilies. One comprises GPI-anchored glycoprotein 
receptors with 10 cysteine residues. Another subfamily includes the secreted single-domain snake
and frog cytotoxins, and differs significantly in that its members generally possess only eight
cysteines and no GPI-anchoring signal sequence (Andermann K, et al. Protein Sci 8(4):810-819
(1999)). The Ly-6 family members are low molecular weight phosphatidyl inositol anchored
glycoproteins with remarkable amino acid homology throughout a distinctive cysteine rich protein
domain that is associated predominantly with O-linked carbohydrate. Their GPI links are necessary
to anchor these cell surface proteins to the outside of the lipid bilayer membrane. The Ly-6 family
includes human CD59, which protects from complement-mediated membrane damage, squid Sgp1
and Sgp2, urokinase plasminogen activator receptor, murine Sca-1 and Sca-2, and many other
proteins. The general structure seen within the Ly-6 family resembles that of the receptor for a
urokinase-type plasminogen activator and the alpha-neurotoxins from snake venoms (Fleming T J
et al J Immunol 150:5379-5390 (1993); Ploug M and V Ellis FEBS Lett 349:163-168 (1994)).

The Ly-6 cell surface proteins are differentially expressed in several hematopoietic lineages
and appear to function in signal transduction and cell activation, predominantly on lymphoid cells in
the mouse. Analyses using anti-Ly-6A/E monoclonal antibodies have also demonstrated in situ
expression of Ly-6 molecules in brain tissue (staining primarily associated with vascular elements
throughout the brain). These proteins do not appear to be expressed during embryonic or neonatal
stages of development (Cray C et al. Brain Res Mol Brain Res 8(1):9-15 (1990)).

Ly-6 protein expression has been shown to be factor-dependent. For example, the 
expression of the Ly-6A/E, which normally occurs in hemopoietic stem cells, fibroblasts, and T and 
B lymphocytes, has been shown to be greatly induced by IFN-ft in various tissues and cell lines. In
addition, the Ly-6E Ag is associated with tyrosine kinases in T cells, and reduced expression of
Ly-6E in T cells impairs normal functional responses, as well as tyrosine kinase activity, in these cells.
Further, the IFNs are important in the generation of memory CD8+ T cells, and it has been
demonstrated that the expression of Ly-6C Ag is a strong marker for the memory phenotype
(Mehran M. et al. Journal of Immunology 163:811-819 (1999)). Like its murine counterparts, a
human homologue of Ly-6 genes, the 9804 gene, is responsive to IFNs. The 9804 gene is also
inducible by retinoic acid during differentiation of acute promyelocytic leukemia cells. Further,
cultured glial and neuronal cells express high levels of Ly-6A/E following incubation with
cytokines, including rIFN-gamma (Cray C et al. Brain Res Mol Brain Res 8(1):9-15 (1990)).
Another member of the Ly-6 family, human protein RoBo-1, shows increased expression in
response to two modulators of bone metabolism, estradiol and intermittent mechanical loading,
suggesting a role in bone homeostasis (Noel LS et al. J Biol Chem, Vol. 273(7):3878-3883 (1998)).
Such factor-dependence of expression makes Ly-6 proteins either candidates or targets for
alloresponses and autoimmune disease. For example, the high level factor-induced expression of
LY-6s has been associated with lupus nephritis (Blake P G et al. J Am Soc Nephrol 4:1140-1150
(1993)). 

Murine Ly-6 molecules have interesting patterns of tissue expression during
haematopoiesis, from multipotential stem cells to lineage committed precursor cells, and on specific
leukocyte subpopulations in the peripheral lymphoid tissues. These patterns suggest an intimate
association between the regulation of Ly-6 expression, and the development and homeostasis of the
immune system (Gumley TP et al. Immunol Cell Biol 73(4):277-296 (1995)). Ly-6M messenger
RNA (mRNA) is easily detectable in hematopoietic tissue (bone marrow, spleen, thymus, peritoneal
macrophages) as well as kidney and lung (Patterson JM et al. Blood 95(10):3125-3132 (2000)).

Normally, human blood cells are protected against autologous complement activation by
membrane proteins that block the assembly of functional complement pores. One such protein is
the human Ly-6 protein CD59. Administration of CD59 prevents hemolytic disease or thrombosis. Further,
the CD59 protein may prevent the complement-mediated lysis and activation of endothelial cells
that leads to hyper-acute rejection, and therefore may be administered during xenogeneic organ
transplantation (Binette, J. P. and Binette, M. B., Scanning Microsc., 7:1107-10 (1993)).

The surface receptor for urokinase plasminogen activator (uPAR) has been recognized in 
recent years as a key molecule in regulating plasminogen-mediated extracellular proteolysis.
Surface plasminogen activation controls the connections between cells, basement membrane and
extracellular matrix, and therefore the capacity of cells to migrate and invade neighboring tissues
(Roldan AL et al. EMBO J 9(2):467-474 (1990)). Certain factors of the PA system, such as u-PAR,
have been detected in organs of the male reproductive tract in various species. The morphological
study provides support for the involvement of the PA system in human male reproductive physiology
(Gunnarsson M et al. Mol Hum Reprod 5(10):934-940 (1999)). 

LY-6 proteins have been suggested to play important roles in disorders such as cancers,
nephropathies, autoimmune diseases, hemolytic disease, thrombosis, Alzheimer's disease, etc.
Several members of the murine Ly-6 supergene family are clearly involved in the progression of
certain mouse tumors, as their expression level is higher in highly malignant cells than in tumor
cells with a lower malignancy phenotype. Sorting by flow cytometry of tumor cells into
subpopulations expressing either high or low levels of Ly-6E.1 yielded cells expressing a high or a
low malignancy phenotype, respectively. Further, it was shown that LY-6 is highly expressed on
non-lymphoid tumor cells originating from a variety of tissues in mice. Upregulation or high
expression is correlated with a more malignant phenotype which results in higher efficiency of local
tumor production (Katz et al Int J Cancer 59:684-91 (1994)).

Cells derived from angiogenic tumors express a higher tumorigenicity phenotype and a
higher capacity to produce artificial pulmonary metastases than cells from the poorly angiogenic
tumors. These cells also express significantly higher levels of the lymphocyte activation protein
Ly-6E, so the angiogenic phenotype appears to be coregulated with Ly-6 (Sagi-Assif O et al.
Immunol Lett 54(2-3):207-13 (1996)). Some LY-6 proteins also block secretion of interleukin 2
(IL-2) which is an approved anticancer agent and a key regulatory hormone in cell-mediated
immunity (Fleming T J and T R Malek J Immunol 153: 1955-62 (1994)). IL-2 stimulates the 
proliferation of both T and natural killer cells and activates NK cells which can directly lyse freshly 
isolated, solid tumor cells. 

The high malignancy, high Ly-6E.1-expressing cells also expressed high levels of the
receptor for urokinase plasminogen activator (uPAR), whereas low malignancy, low
Ly-6E.1-expressing cells also expressed low levels of uPAR. Transfection studies have indicated that uPAR
is causally involved in conferring a high malignancy phenotype upon tumor cells expressing high
levels of Ly-6E.1. E48, a human homologue of the murine ThB Ly-6 protein, is expressed on head
and neck squamous carcinoma cells. In E48-stimulated cells, the binding of E48 to its
microenvironmental ligand appears to transduce a signal that up-regulates the expression of the FX
enzyme in these cells, leading to an increase in the levels of GDP-L-fucose (Rinat Eshel et al. J
Biol Chem, Vol. 275(17):12833-12840 (2000)). A congenital disorder of leukocyte adhesion to
vascular endothelium termed LADII is reflected in a generalized fucose deficiency and major 
defects in leukocyte trafficking and function. Ly-6 loss-variants of a murine tumor exhibit 
alterations in the incorporation of fucose and mannose into cellular glycoconjugates (Witz IP J. 
Cell. Biochem. Suppl. 34:61-66 (2000)). 

It is believed that the protein of SEQ ID NO:266 is a novel member of the Ly-6 protein
family, and is thus a specific cell-surface glycoprotein antigen involved in signal transduction and
cell activation, proliferation and differentiation. Preferred polypeptides of the invention are
polypeptides comprising the amino acids of SEQ ID NO:266 from position 1 to position 18 and
from position 19 to position 124. Other preferred polypeptides of the invention are any fragments
of SEQ ID NO:266 having any of the biological activities described herein.

In one embodiment, this invention relates to methods and compositions using the protein of 
the invention or part thereof as a marker protein to selectively identify tissues, preferably testis. For 
example, the protein of the invention or part thereof may be used to generate specific antibodies using any
technique known to those skilled in the art. Such tissue-specific antibodies may then be used to
identify tissues of unknown origin, such as forensic samples, differentiated tumor tissue that has
metastasized to foreign bodily sites, etc., or to distinguish different tissue types in a tissue
cross-section using immunochemistry.

Another embodiment of the present invention relates to methods of using the protein of
the invention or part thereof and related compounds and derivatives to diagnose developmental and
malignant disorders in tissues including urogenital tissues and other tissues of the reproductive
system of both sexes. For example, a biological sample is obtained from a patient with cancer or at
risk of developing cancer, and the level of SEQ ID NO:25 polynucleotides or encoded polypeptides
is detected within the cells of the sample. The detection of an elevated level of the SEQ ID NO:25
polynucleotides or encoded polypeptides in the sample relative to a control level indicates the
presence of malignant cells within the patient. The expression of the protein of the invention can be
investigated using any of a number of methods, including, but not limited to, Northern blotting,
RT-PCR or immunoblotting.

Another embodiment of the invention relates to compositions and methods using the protein
of the invention or part thereof in recombinant protein form as pharmacological agents in the
treatment of developmental and malignant disorders in tissues including urogenital tissues and in
other tissues of the human reproductive system. In particular, the protein of the invention or part thereof
can be used in the treatment of disorders which are manifested by male sterility.

In another embodiment of the invention, antibodies which bind to the protein of the
invention or part thereof are used in the treatment of tumors, e.g., human urogenital tumors,
especially to enhance the secretion of interleukin 2, which is an approved anticancer agent and key
regulatory hormone in cell-mediated immunity. Such antibodies can be used alone or bound to a
substance capable of ablating or killing cells as a therapy for urogenital disorders or cancers in
which the protein of the invention is overexpressed.

The protein of the invention or part thereof may also be used in the treatment of diseases 
which can require transplantation, including various forms of cancers such as genitourinary cancers, 
carcinomas, sarcomas, atherosclerosis, angiogenesis, and benign tumors. As mentioned above, the
Ly-6 family includes several proteins which are similar to the protein of the invention and which are
capable of protecting cells from complement-mediated membrane damage. Therefore, in another 
embodiment of the invention, recombinant proteins encoded by SEQ ID NO:25 or fragments 
thereof are administered during xenogeneic tissue transplantation to prevent complement-mediated 
lysis and to block activation of endothelial cells, which normally leads to hyper-acute rejection. 

In addition, prevention of complement-mediated lysis may be particularly important in
human and animal reproductive therapy, where functional survival of the germ cells during in vitro
handling is crucial. Storage of sperm is of widespread importance in commercial animal breeding
programs, human sperm donor programs, and in the treatment of certain disease states. For
example, sperm samples may be frozen for men who have been diagnosed with cancer or other
diseases that may eventually interfere with sperm production, as well as for assisted reproduction
purposes where sperm may be stored for use at other locations or times. The procedures utilized in
such cases include: washing a sperm sample to separate out the sperm-rich fraction from non-sperm
components of a sample such as seminal plasma or debris; further isolating the healthy, motile
sperm from dead sperm or from white blood cells in an ejaculate; freezing or refrigerating sperm
for use at a later date or for shipping to females at differing locations; extending or diluting sperm
for culture in diagnostic testing or for use in therapeutic interventions such as in vitro fertilization or
intracytoplasmic sperm injection (Cohen et al. 12:994-1001 (1997)). Once sperm have been
washed or isolated, they are then extended (or diluted) in culture or holding media for a variety of 
uses (sperm analysis, diagnostic tests, assisted reproduction). Each of these uses for extended or
diluted sperm requires a somewhat different formulation of basal medium (see, for review, US
Patent No. 6,140,121 Ellington et al. Oct. 2000); however, in all cases sperm survival is suboptimal
outside of the female reproductive tract. Novel additional components of a dilution or storage
medium which could improve the functional preservation of sperm would be useful. Therefore, in
another preferred embodiment of this invention, purified recombinant proteins encoded by SEQ ID
NO:25 or fragments thereof can be added as components of pharmacological media designed to
protect spermatozoa. The methods used to compose such preservation media are generally known
by those skilled in the art (e.g., Oliver S.A. et al. US patent 5,897,987 Apr. 1999; Cohen J. et al.,
supra). Conversely, in yet another embodiment of this invention, ligands, inhibitors, neutralizing
antibodies or other biological agents which recognize, bind and block the protein of the invention
can be used as components of pharmacological formulations designed for male
contraception purposes.

In still another embodiment of this invention, chimeric ligands or derivatives which
recognize the protein of the invention or part thereof and which can be internalized into cells can
be used to design a system of drug delivery finely targeted toward urogenital and other tissues
which express the protein of SEQ ID NO:266. For example, such recognizing molecules can be
incorporated into the membranes of liposomes to allow the specific delivery of the liposomes to
cells expressing the protein of SEQ ID NO:266. Methods of designing such drug delivery systems
are known by those skilled in the art (Smith H.J. Introduction to the principles of drug design and
action, 3rd ed. (1998)).

Proteins of SEQ ID NOs:417, 413, 418 (internal designations 188-45-1-0-D3-CS, 188-26-4-0-F5-CS,
and 188-5-1-0-H6-CS)

The proteins of SEQ ID NOs:417, 413, and 418, encoded by the cDNAs of SEQ ID NOs:176, 172,
and 177, are expressed in the brain and exhibit strong homology with proteins with redox
activity (see, e.g. Genbank accession numbers AK001293 and AF029689, and Geneseqp accession 
number: Y59180). 

The protein of SEQ ID NO:418 (320 amino acids) is a variant of AK001293 (322 amino
acids). Relative to SEQ ID NO:418, AK001293 has six extra nucleotides within the same ORF,
producing a protein two amino acids longer. SEQ ID NO:418 exhibits the Pfam Zinc-binding
dehydrogenase (adh_zinc) signature from positions 16 to 313. SEQ ID NO:418 presents all the
conserved residues of the motif except for a histidine that is thought to be a zinc-ligand. This lack
of zinc-ligand residues is a feature of the quinone oxidoreductases (QOR), a subfamily of
zinc-binding dehydrogenases.

SEQ ID NO:413 (191 amino acids) shares the first 172 amino acids with SEQ ID NO:418.
The deletion of one nucleotide at position 583 in the SEQ ID NO:413 cDNA sequence
(corresponding to amino acid 173), however, creates a change of ORF compared to SEQ ID 
NO:418 and AK001293. 

SEQ ID NO:417 is a short protein (20 amino acids) whose sequence corresponds to the
N-terminal end of the other proteins of the invention. The presence of a T (instead of a G in public
sequences and SEQ ID NOs:413 and 418) at position 128 on the cDNA creates a STOP codon, 
creating a shorter protein. 
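
The two kinds of change described above, a single-nucleotide deletion that shifts the reading frame and a G-to-T substitution that creates a stop codon, can be illustrated on a toy open reading frame. The sketch below is not based on the actual cDNA sequences, which are not reproduced here: the 21-nucleotide sequence `WILD_TYPE` and the mutated positions are invented for illustration, and only the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate from the first nucleotide until a stop codon or the end."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Hypothetical toy ORF: ATG GCT GAA GGT CTG AAA TAA
WILD_TYPE = "ATGGCTGAAGGTCTGAAATAA"

# Deleting one nucleotide (index 5) shifts the frame downstream of the deletion.
frameshift = WILD_TYPE[:5] + WILD_TYPE[6:]

# Replacing G with T at the first position of codon 3 turns GAA into TAA (stop).
nonsense = WILD_TYPE[:6] + "T" + WILD_TYPE[7:]

print(translate(WILD_TYPE))   # MAEGLK
print(translate(frameshift))  # MAKV
print(translate(nonsense))    # MA
```

In this toy case the frameshifted product keeps the N-terminal residues encoded upstream of the deletion and then diverges, while the substitution yields a truncated product, mirroring the relationships described for SEQ ID NOs:413 and 417 relative to SEQ ID NO:418.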

SEQ ID NOs:417, 413 and 418 are similar to the QORs, a family of zinc-binding 
dehydrogenases. QORs are cytoplasmic redox-regulated flavoenzymes that catalyze the one- or
two-electron reduction of quinones. QORs bind NADP and are inhibited by dicoumarol.

The activity of QORs protects cells against toxicity, mutagenicity, and cancer due to
exposure to environmental and synthetic quinones and their precursors. Thus, QORs play a central
role in monitoring cellular redox state and act to protect against oxidative stress induced by a
variety of metabolic situations (Raina A.K. et al. (1999) Redox Rep. 4:23-7). The oxidoreductase
activity also permits the activation of bioreductive anticancer drugs (Begleiter A. et al. (1996) Br. J.
Cancer Suppl. 27:S9-14). 

The metabolism of quinones involves enzymatic reduction of the quinone by one or two 
electrons. In the activation of quinone-containing antitumor agents, this reduction results in the
formation of the semiquinone or the hydroquinone of the anticancer drug. The consequence of
a one-electron reduction is that the semiquinone yields its extra electron to oxygen with the
formation of superoxide radical anion and the original quinone. This reduction by a reductase
followed by oxidation by molecular oxygen (dioxygen) is known as redox-cycling and continues
until the system becomes anaerobic. In the case of a two-electron reduction, the hydroquinone 
could become stable, and as such, be excreted by the organism in a detoxification pathway. 
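
The reductions described in this paragraph can be summarized schematically. The scheme below is a generic sketch rather than a mechanism for any particular drug; Q denotes the quinone, and the symbols are introduced here for illustration only.

```latex
\begin{align*}
\mathrm{Q} + e^{-} &\longrightarrow \mathrm{Q}^{\bullet -}
  && \text{(one-electron reduction to the semiquinone)} \\
\mathrm{Q}^{\bullet -} + \mathrm{O_2} &\longrightarrow \mathrm{Q} + \mathrm{O_2}^{\bullet -}
  && \text{(reoxidation by dioxygen, forming superoxide)} \\
\mathrm{Q} + 2e^{-} + 2\mathrm{H}^{+} &\longrightarrow \mathrm{QH_2}
  && \text{(two-electron reduction to the hydroquinone)}
\end{align*}
```

The first two steps together regenerate the quinone and consume oxygen, which is why the cycle repeats until oxygen is exhausted; the third step bypasses the semiquinone.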

The cellular antioxidant response is mediated by a battery of detoxifying/defensive proteins.
The promoters of genes that encode these proteins contain a common cis-element termed the
antioxidant response element (ARE). Many transcription factors, including Nrf, Jun, Fos, Fra, Maf,
YABP, ARE-BP1, Ah (aromatic hydrocarbon) receptor, and estrogen receptor bind to the ARE
from various genes. Among these factors, Nrf-Jun heterodimers positively regulate ARE-mediated
expression and induction of genes in response to antioxidants and xenobiotics (reviewed in
Dhakshinamoorthy S. et al. (2000) Curr. Top Cell Regul. 36:201-16). On the other hand, c-Fos
represses ARE-mediated gene expression (Venugopal, R., and Jaiswal, A.K. (1996) Proc. Natl.
Acad. Sci. USA 93, 14960-5).

Elevated levels of QOR activity have been reported in several kinds of tumors such as liver, 
colon, lung and breast (Belinsky M., Jaiswal A.K. (1993) Cancer Metastasis Rev 12:103-17).
Bioreductive antitumor agents are an important class of anticancer drugs that require activation by
reduction. For this reason, QORs are a potential target on which to base the development of new
antitumor compounds. Certain QORs have already been implicated in the metabolism, activation
and mechanism of cytotoxicity of some anticancer drugs such as mitomycin C, indoloquinone EO9
(Ross D. et al. (1994) Oncol. Res. 6:493-500), CB 1954 (Knox R.J. et al. (2000) Cancer Res.
60:4179-86) or antiestrogens in breast cancer (Montano M.M., Katzenellenbogen B.S. (1997) 
PNAS 94:2581-6). 

In addition, some of the proteins of the QOR family are thought to play a role in the 
prevention of apoptosis following oxidative stress. The tumor suppressor gene p53 has been
directly implicated in the induction of apoptosis in dividing cells and in hippocampal pyramidal
neurons (Jordan J. et al. (1997) J. Neurosci 17:1397-405) and a QOR gene has been described as a
p53-regulated gene (Kostic C, Shaw P.H. (2000) Oncogene 19:3978-87).

It is believed that the proteins of SEQ ID NOs:417, 413 and 418 have a redox activity, most
likely as QORs. Thus, they are expected to act as endogenous antioxidants against oxidative
stress and may be able to use NADP as cofactor. The proteins of the invention may be used to
deactivate toxins and to activate bioreductive anticancer drugs. In addition, they may prevent
apoptosis following oxidative stress and be regulated by p53. Because the proteins of SEQ ID NOs:417
and 413 do not contain the Pfam Zinc-binding dehydrogenase (adh_zinc) signature, in contrast to
SEQ ID NO:418, they may act as competitive inhibitors, i.e. dominant negative forms, of the
functional protein.

The oxidoreductase activity of the proteins of the invention may be assayed using any 
technique known to those skilled in the art. For example, the measurement of the rate of oxidation
of NADPH and oxygen consumption, and the detection of the semiquinone and reactive oxygen
species, may be performed as described by Gutierrez P.L. (Gutierrez P.L. (2000) Front. Biosci.
5:D629-38), or by any other method known to those skilled in the art. The enzymatic activity of
the proteins of the invention in different affected and control tissues may be assayed by
histochemical staining. To confirm the role of the proteins of the invention in the cellular
antioxidant response, in vitro and in vivo assays may be performed. Transcription levels of the
genes coding for the proteins of the invention may be measured using standard techniques after
exposure to quinones or derived compounds such as beta-naphthoflavone (beta-NF), as described by
Belinsky M. and Jaiswal A.K. (supra), as well as in response to transcription factors such as Nrf,
Jun and c-Fos, or in the presence of p53.

In one embodiment of the present invention, the present protein can be used to detect
specific cell types in vitro or in vivo. For example, as the present proteins are overexpressed in the
brain, reagents capable of specifically recognizing the present protein can be used as markers for
brain cells. Brain-specific markers have a number of uses, including the identification of
specific tissues for histological analyses, as well as detection of the origin of tumor cells. In addition,
as the expression of the present protein is likely induced by transcription factors such as Nrf, Jun,
Fos, Fra, Maf, YABP, ARE-BP1, Ah (aromatic hydrocarbon) receptor, and estrogen receptor, as
well as by p53, reagents specific for detecting the present protein can also be used as a marker for
the activity of any of these proteins in vitro or in vivo. In view of the association between many of
these proteins and diseases such as cancer, the ability to detect the presence or absence of the
proteins provides powerful tools for disease diagnosis and screening. For any of these applications,
the expression of the present protein can be detected using any standard method, including Northern
blots, western blots, in situ hybridization, PCR, etc.

In another embodiment, the proteins of the invention can serve as markers for cellular 
oxidative stress in vivo and in vitro. As such, the proteins of the invention or part thereof may be
useful in the diagnosis of disorders in which oxidative stress is implicated, including a large variety
of types of cancer as well as neurodegenerative disorders such as Alzheimer's disease (AD),
amyotrophic lateral sclerosis (ALS) or Parkinson's disease (PD). For diagnostic purposes, the
expression of the protein of the invention may be investigated using, e.g., Northern blotting,
RT-PCR or immunoblotting methods and compared to the expression in control individuals.
Increased levels of the proteins of the invention in patients compared with controls indicate a major
shift in redox balance and, thus, indicate the presence of the disease or of a susceptibility to the
disease.

The invention further relates to methods and compositions using the proteins of the 
invention or part thereof to prevent and/or treat disorders in which oxidative stress is implicated, 
including those mentioned above. For these purposes the proteins themselves, or polynucleotides 
encoding the proteins, or an activator of protein expression may be administered to patients, or to
disease-free individuals in case of increased susceptibility to one of these disorders.

In another embodiment, the protein of the invention or part thereof is used to prevent cells 
from undergoing apoptosis. They may thus be useful in the diagnosis, treatment and/or prevention 
of disorders and processes in which apoptosis is deleterious, including but not limited to immune 
deficiency syndromes (including AIDS), type I diabetes, pathogenic infections, cardiovascular and
neurological injury, alopecia, aging, degenerative diseases including AD and PD, dystonia, Leber's
hereditary optic neuropathy and schizophrenia. For all such diagnostic purposes, the expression of
the proteins of the invention can be investigated using any of the Northern blotting, RT-PCR or
immunoblotting methods described herein and compared to the expression in control individuals.

The invention relates to methods and compositions using the proteins of the invention or
part thereof as detoxifying enzymes against quinones. There are a variety of quinones with a toxic
effect in cells (e.g. quinones derived from the oxidation of phenolic metabolites of benzene,
DA-quinones, or menadione). Thus, the proteins of the invention or part thereof may be protective
against the hematotoxic and carcinogenic effects of benzene, as well as against benzene-caused
diseases such as cancer, aplastic anemia and pancytopenia. Moreover, they may detoxify
DA-quinones in the brain, thereby providing neuroprotection in Parkinson's Disease. In still another
embodiment, the proteins of the invention or part thereof may protect cells against
menadione-induced oxidative stress, with known effects on myocardial cells (Floreani M. et al
(2000) Biochem Pharmacol. 60:601-5). For prevention and/or treatment purposes the proteins
themselves, or polynucleotides encoding the proteins, or an activator of protein expression may be
administered.

In another embodiment, the present proteins may be a target of chemotherapy specific to
different kinds of cancer, to ensure a favorable response to anticancer drugs. Specifically, proteins
of the invention or part thereof may be used as an activator of cytotoxic prodrugs of the quinone family.
Accordingly, the protein of the invention or part thereof may be administered to a patient in
conjunction with a bioreductive anticancer agent in order to activate the drug. This
co-administration may be by simultaneous administration, such as a mixture of the oxidoreductase and
the drug, or by separate simultaneous or sequential administration. Cancer-specific antitumor
agents based on QOR substrates may be designed as described by Xing J. et al. (Xing J. et al. (2000)
Med. Chem. 43:457-66) and assayed as described in Li B. et al. (Li B et al. (1999) Chem. Res.
Toxicol. 12:1042-9). Alternatively, as the present proteins may be overexpressed in tumor cells,
such methods may be performed by simply detecting the level of the present protein in tumor cells,
and administering the prodrug specifically to those patients found to have elevated levels of the
protein in their tumor cells.

Proteins of SEQ ID NOs:415, 310, 317 (internal designations 188-29-2-0-H1-CS, 188-18-4-0-A9-CS,
188-9-2-0-E1-CS)

Mammalian inositol hexakisphosphate kinase 2 (IP6K2), an enzyme of the inositol
phosphate pathway, has been cloned and described by two independent groups [Saiardi, A.;
Erdument-Bromage, H.; Snowman, A. M.; Tempst, P.; and Snyder, S. H. (1999) Current Biology 9,
1323-1326, and Katai, K.; Miyamoto, K.-I.; Kishida, S.; Segawa, H.; Nii, T.; Tanaka, H.; Tani, Y.;
Arai, H.; Tatsumi, S.; Morita, K.; Taketani, Y.; and Takeda, E. (1999) Biochem. J. 343, 705-712].
Newly identified consensus sequences of inositol-polyphosphate kinases are represented by
[LV]-[LA]-[DE]-X(3-8)-P-X-[VAI]-[ML]-D-X-K-[ML]-G [Saiardi, A.; Erdument-Bromage, H.;
Snowman, A. M.; Tempst, P.; and Snyder, S. H. (1999) Current Biology 9, 1323-1326]. IP6K2
catalyzes the transfer of phosphate groups from InsP6 or Ins(1,3,4,5,6)P5 (the substrate), to another
protein or small molecule, such as a nucleoside di-phosphate.
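
A consensus written in this bracket notation can be converted mechanically into a regular expression and used to scan a candidate sequence. The following is a minimal sketch under stated assumptions: the converter `consensus_to_regex` is a hypothetical helper, the final position is read as `[ML]` followed by `G`, and the test peptide is invented for illustration and is not one of the sequences of the invention.

```python
import re

# Consensus as given in the text, with X meaning any residue.
CONSENSUS = "[LV]-[LA]-[DE]-X(3-8)-P-X-[VAI]-[ML]-D-X-K-[ML]-G"


def consensus_to_regex(consensus: str) -> str:
    """Convert the bracket/X notation into Python regular expression syntax."""
    # X(3-8) becomes a run of 3 to 8 arbitrary residues.
    regex = re.sub(r"X\((\d+)-(\d+)\)", r".{\1,\2}", consensus)
    # A bare X becomes exactly one arbitrary residue.
    regex = regex.replace("X", ".")
    # The remaining hyphens only separate positions.
    return regex.replace("-", "")


MOTIF = re.compile(consensus_to_regex(CONSENSUS))

# Hypothetical peptide containing one instance of the motif.
candidate = "MKLLDAAAAPSVMDCKMGTT"
hit = MOTIF.search(candidate)

print(MOTIF.pattern)             # [LV][LA][DE].{3,8}P.[VAI][ML]D.K[ML]G
print(hit.start(), hit.group())  # 2 LLDAAAAPSVMDCKMG
```

A dedicated motif scanner would additionally handle ambiguity codes and overlapping hits; the regular expression is only meant to make the notation concrete.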

The subject invention provides the polypeptides of SEQ ID NOs:415, 310, and 317,
encoded by the cDNAs of SEQ ID NOs:174, 69, and 76, respectively. The invention also provides
biologically active fragments of SEQ ID NOs:415, 310, and 317. In one embodiment, the
polypeptides of SEQ ID NOs:415, 310, and 317 are interchanged with the corresponding
polypeptides encoded by the human cDNA of clone 188-29-2-0-H1-CS, 188-18-4-0-A9-CS, or
188-9-2-0-E1-CS. "Biologically active fragments" are defined as those peptide or polypeptide
fragments having at least one of the biological functions of the full length protein (e.g., kinase
activity). Compositions of the protein/polypeptide of SEQ ID NOs:415, 310, or 317, or biologically
active fragments thereof, are also provided by the subject invention. These compositions may be 
made according to methods well known in the art. 

The invention also provides variants of the protein of SEQ ID NOs:415, 310, or 317. These 
variants have at least about 80%, more preferably at least about 90%, and most preferably at least
about 95% amino acid sequence identity to the amino acid sequences encoded by SEQ ID NOs:415,
310, and 317. Variants according to the subject invention also have at least one functional or
structural characteristic of the protein of SEQ ID NOs:415, 310, or 317. The invention also
provides biologically active fragments of the variant proteins. Compositions of variants, or
biologically active fragments thereof, are also provided by the subject invention. These
compositions may be made according to methods well known in the art. Unless otherwise
indicated, the methods disclosed herein can be practiced utilizing the protein encoded by SEQ ID
NO:415, 310, or 317, biologically active fragments of SEQ ID NO:415, 310, or 317, variants of
SEQ ID NO:415, 310, or 317, and biologically active fragments of the variants.
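
The identity thresholds recited for the variants can be made concrete with a small calculation. The sketch below assumes the two sequences have already been aligned to equal length (with "-" for gaps) and uses one common convention, matches divided by alignment length; other conventions exist, and the text does not specify one. The function names, tier labels and example sequences are illustrative assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns with the same residue in both rows."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    # A column counts as a match only if both rows hold the same non-gap residue.
    matches = sum(a == b and a != "-" for a, b in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


def identity_tier(pct: float) -> str:
    """Map a percent identity onto the 80/90/95 tiers, using hard cut-offs."""
    if pct >= 95:
        return "most preferred"
    if pct >= 90:
        return "more preferred"
    if pct >= 80:
        return "minimum"
    return "below stated range"


# Hypothetical 20-residue reference and a variant with one substitution.
reference = "ACDEFGHIKLMNPQRSTVWY"
variant = "ACDEFGHIKLMNPQRSTVWF"

pct = percent_identity(reference, variant)
print(pct, identity_tier(pct))  # 95.0 most preferred
```

Because the text says "at least about", the hard cut-offs used here are a simplification; in practice the alignment method and scoring parameters also affect the reported value.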

Because of the redundancy of the genetic code, a variety of different DNA sequences can 
encode the amino acid sequence of SEQ ID NO:415, 310, or 317. In a preferred embodiment, SEQ 
ID NO:415, 310, or 317 is encoded by clone 188-29-2-0-H1-CS, 188-18-4-0-A9-CS, or 
188-9-2-0-E1-CS, or by the cDNAs of SEQ ID NO:174, 69, or 76. It is well within the skill of a person 
trained in the art to create these alternative DNA sequences which encode proteins having the same, 
or essentially the same, amino acid sequence. These variant DNA sequences are, thus, within the 
scope of the subject invention. As used herein, reference to "essentially the same" sequence refers 
to sequences that have amino acid substitutions, deletions, additions, or insertions that do not 
materially affect biological activity. Fragments retaining one or more characteristic biological 
activities of the protein encoded by SEQ ID NO:415, 310, or 317 are also included in this definition. 
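
The redundancy argument can be made concrete with a short sketch. It assumes the standard genetic code (the source does not name a translation table) and uses hypothetical helper names; it shows that different DNA sequences translate to the same peptide and counts how many such sequences exist for a given peptide.

```python
from itertools import product
from math import prod

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence in frame 1, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def count_encodings(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the given peptide."""
    codons_per_aa = {}
    for aa in CODON_TABLE.values():
        codons_per_aa[aa] = codons_per_aa.get(aa, 0) + 1
    return prod(codons_per_aa[aa] for aa in peptide.upper())
```

Under these assumptions `translate("CTGAGC")` and `translate("TTATCT")` both give the dipeptide `LS`, and `count_encodings` grows multiplicatively with peptide length, which is why a full-length protein has an enormous number of synonymous coding sequences.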

In one aspect of the subject invention, SEQ ID NO:415, 310, or 317, and variants thereof, 
can be used to generate polyclonal or monoclonal antibodies. Both biologically active and 
immunogenic fragments of SEQ ID NO:415, 310, or 317, or variant proteins, can be used to 
produce antibodies. Polyclonal and/or monoclonal antibodies can be made according to methods 
well known to the skilled artisan. Antibodies produced in accordance with the subject invention can 
be used in a variety of detection assays known to those skilled in the art. The antibodies may be 
used to agonize or antagonize the biological activity of the protein of SEQ ID NO:415, 310, or 317. 

The protein of SEQ ID NO:415, 310, or 317 can be used for the synthesis of nucleoside 
triphosphate (NTP) compounds. In one embodiment, the NTP compound produced is ATP, GTP, 
CTP, or TTP. In this aspect of the subject invention, SEQ ID NO:415, 310, or 317 removes a 
phosphate from InsP6 or Ins(1,3,4,5,6)P5 and transfers it to a nucleoside diphosphate (e.g., ADP, 
CDP, GDP, or TDP) to create an NTP. The conditions and methods for the synthesis of NTP 
compounds, such as ATP, are well known to the skilled artisan. Thus, the protein of SEQ ID 
NO:415, 310, or 317 has an industrially useful function in the synthesis of commercially valuable 
products. 

The subject invention also provides methods of determining the relative amounts of InsP6 
or Ins(1,3,4,5,6)P5 in the cell by a kinase assay. In this aspect of the invention, SEQ ID NO:415, 
310, or 317 can be used to transfer phosphate groups from InsP6 or Ins(1,3,4,5,6)P5 to acceptor 
substrates according to well-known kinase activity assays. 

Protein of SEQ ID NO:294 (internal designation 181-16-2-0-A7-CS) 

The protein of SEQ ID NO:294 is encoded by the cDNA of SEQ ID NO:53. It will be 
appreciated that all characteristics and uses of the polypeptide of SEQ ID NO:294 described 
throughout the present application also pertain to the polypeptide encoded by the human cDNA of 
clone 181-16-2-0-A7-CS. In addition, it will be appreciated that all characteristics and uses of the 
nucleic acid of SEQ ID NO:53 described throughout the present application also pertain to the 
human cDNA of clone 181-16-2-0-A7-CS. 

This gene was isolated from fetal liver and expression has also been detected in fetal 
kidney, placenta, liver, brain, hypertrophic prostate, salivary gland and testis. Data from PCT 
application WO 98/23435 indicate expression is primarily in bone marrow cell lines, and to a lesser 
extent, in human endometrial stromal cells, human adult small intestine and human pancreas tumor. 
PCT application WO 99/14484 reports the fraction of expression in the gastrointestinal system 
(0.227), reproductive system (0.193), and hematopoietic/immune system (0.168). Finally, this 
protein is 55% identical and 76% similar to CGI-128 protein, which was isolated from CD34+ cells 
and is also found in cell lines from the hematopoietic lineage including HL6 (granulocytic), Jurkat 
(T-lymphocytic), K562 (erythro-megakaryocytic), and U937 (monocytic). 

Supernatant harvested from cells expressing the product of this gene has been shown to 
increase the permeability of the plasma membrane of renal mesangial cells to calcium. Thus, it is 
believed that the product of this gene is involved in activating a signal transduction pathway when it 
binds a receptor on the surface of the plasma membrane of mesangial cells, as well as of other 
cell lines or tissue cell types. Thus, the polynucleotides and polypeptides have 
uses that include, but are not limited to, activating mesangial cells by contacting said cells with a 
full length polypeptide or a polypeptide fragment which demonstrates this biological activity. 
Further, the polynucleotides and polypeptides can be used in the methods described in WO 99/15652, 
incorporated in its entirety. Binding of a ligand to a receptor is known to alter intracellular levels of 
small molecules, such as calcium, potassium and sodium, as well as alter pH and membrane 
potential. Alterations in small molecule concentration can be measured to identify supernatants 
that bind to receptors of a particular cell. In addition, when tested against fibroblast cell lines, 
supernatants removed from cells containing this gene activated the EGR1 (early growth response 
gene 1) promoter element. Thus, it is likely that this gene activates fibroblast cells through the 
EGR1 signal transduction pathway. EGR1 is a separate signal transduction pathway from 
Jak-STAT; genes containing the EGR1 promoter are induced in various tissues and cell types upon 
activation, leading the cells to undergo differentiation and proliferation (PCT application WO 
98/23435). 

Polynucleotides comprising sequences encoding the signal peptide of the protein, e.g. 
VLWLSGLSEPGAA/RQ, can be used in construction of secretion vectors. These vectors would 
then facilitate the secretion of fusion proteins into the media of cells that have been transfected with 
the construct of interest. Antibodies which specifically bind the signal peptide could be used to 
purify the fusion protein from the media if desired. 

Polynucleotides and polypeptides of the invention are useful as reagents for differential 
identification of the tissue(s) or cell type(s) present in a biological sample and for diagnosis of 
diseases and conditions which include, but are not limited to, haemopoietic and gastrointestinal tract 
disorders and stromatosis, in addition to endothelial, mucosal, or epithelial cell disorders. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing immunological 
probes for differential identification of the tissue(s) or cell type(s). For a number of disorders of the 
above tissues or cells, particularly of the immune and digestive systems, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues and cell types (e.g. 
hematopoietic, immune, reproductive, gastrointestinal, endocrine, and cancerous and wounded 
tissues) or bodily fluids (e.g. lymph, serum, plasma, urine, synovial fluid and spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 
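
The comparison against a standard expression level can be sketched as a simple log-ratio test. This is an illustration only: the source says "significantly higher or lower" without naming a cutoff, so the two-fold default below, the function name, and the assumption that both levels are measured on the same positive scale are all hypothetical.

```python
from math import log2


def compare_to_standard(sample_level: float, standard_level: float,
                        fold_threshold: float = 2.0) -> str:
    """Classify a sample's expression level relative to a healthy-tissue standard.

    fold_threshold is a hypothetical cutoff; the source names no number.
    """
    if sample_level <= 0 or standard_level <= 0:
        raise ValueError("expression levels must be positive")
    # Work on a log scale so that up- and down-regulation are treated symmetrically.
    log_ratio = log2(sample_level / standard_level)
    cutoff = log2(fold_threshold)
    if log_ratio >= cutoff:
        return "higher"
    if log_ratio <= -cutoff:
        return "lower"
    return "comparable"
```

In practice a statistical test over replicate measurements would replace the fixed fold cutoff; the sketch only shows the direction of the comparison described in the text.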

The tissue distribution in bone marrow cells, fetal liver and fetal kidney, combined with the 
detected calcium flux and EGR1 biological activity, indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for immune and gastrointestinal tract disorders, and 
stromatosis, particularly tumors and proliferative disorders. More specifically, polynucleotides and 
polypeptides corresponding to this gene are useful for the treatment and diagnosis of hematopoietic 
related disorders such as anemia, pancytopenia, leukopenia, thrombocytopenia or leukemia since 
stromal cells are important in the production of cells of hematopoietic lineages. The polypeptides 
and polynucleotides of the invention can be used to enhance hematopoiesis as described in 
WO 98/31385, incorporated in its entirety. The uses include bone marrow cell ex vivo culture, bone 
marrow transplantation, bone marrow reconstitution, radiotherapy or chemotherapy of neoplasia. 
The gene product may also be involved in lymphopoiesis; therefore, it can be used in immune 
disorders such as infection, inflammation, allergy, immunodeficiency, etc. In addition, this gene 
product may have commercial utility in the expansion of stem cells and committed progenitors of 
various blood lineages, and in the differentiation and/or proliferation of various cell types. The protein, 
as well as antibodies directed against the protein, may show utility as a tumor marker and/or 
immunotherapy target for the above listed tissues. 

Additionally, since the gene product of 181-16-2-0-A7-CS has been shown to activate the 
EGR1 promoter element, it likely activates EGR1 signaling activity in fibroblasts. Recent data 
show that activation of EGR1 plays a role in wound repair. The cellular transcription factor early 
growth response factor 1 (Egr-1) is expressed minutes after acute injury and serves to stimulate the 
production of a class of growth factors whose role is to promote tissue repair. Egr-1 expression at 
the site of dermal wounding in rodents promotes angiogenesis in vitro and in vivo, increases 
collagen production, and accelerates wound closure. These results show that Egr-1 gene therapy 
accelerates the normal healing process (Human Gene Ther 2000, vol 11(15):2143-58). Thus, an 
activator of EGR1 signaling, specifically the gene products of 181-16-2-0-A7-CS (polypeptides and 
polynucleotides), would be useful in the wound healing process using the methods described in 
WO 99/41282 and WO 99/32135, incorporated by reference in their entireties. 

Protein of SEQ ID NO:305 (internal designation 187-37-0-0-c10-CS) 

The protein of SEQ ID NO:305, encoded by the cDNA of SEQ ID NO:64, is highly 
expressed in the prostate and brain. The protein of the invention is strongly homologous to the D9 
protein, found in both humans (GNP accession number: U95006 and U95007) and in mice (GNP 
accession number: U95003, U95004, and U95005). D9 is a myeloid precursor protein transcript 
regulated by the retinoic acid receptor α, hereafter referred to as RAR-α (Scott et al. Blood 1996; 
88:2517-30). 

Retinoic acid is the active metabolite of vitamin A, which contributes to a wide range of 
biological processes such as cellular differentiation, embryogenesis, and tumor suppression. More 
specifically, retinoic acid stimulates myeloid precursor differentiation into mature granulocytes. 
For instance, in vitro treatment of acute promyelocytic leukemia blast cells with retinoic acid 
induces their differentiation (Miyauchi et al. Leuk Lymphoma 1999;33:267-80). In addition, 
treatment with retinoic acid can induce disease remission in patients affected with promyelocytic 
leukemia by causing granulocyte precursor differentiation (Slack et al. Ann Hematol 
2000;79:227-38). 

The diverse range of responses to retinoic acid is mediated by three receptor subtypes: 
RAR-α, RAR-β, and RAR-γ. RAR-α has been identified as being important for bone marrow 
maturation of granulocytes (Tsai et al. Genes Dev 1992;6:2258-69). In addition, RAR-α is almost 
invariably involved in acute promyelocytic leukemia cells by a reciprocal translocation between the 
long arms of chromosomes 15 and 17 (Alcalay et al., Proc Natl Acad Sci USA 1991;88:1977-81). 
This type of leukemia is mainly characterized by a predominance of malignant promyelocytes, and 
severe hemorrhagic manifestations resulting from activation of the coagulation cascade and the 
fibrinolytic system (Tallman et al. Semin Thromb Hemost 1999;25:209-15). Reciprocal 
chromosomal translocation leads to the production of a fusion protein that inhibits differentiation 
and promotes survival of myeloid precursor cells (Grignani et al. Cell 1993;74:423-431). Transient 
transfection of a vector containing RAR-α in a promyelocyte cell line causes early upregulation 
of several genes, including D9, which is strongly related to the protein of SEQ ID NO:305 
(Scott et al. Blood 1996;88:2517-30). Thus, it is believed that the protein of SEQ ID NO:305 is a 
myeloid-related protein whose expression is induced by the activation of retinoic acid receptors, 
including RAR-α. 

In a preferred embodiment, the protein of the invention or part thereof may be used to assay 
the activity of RAR-α protein or retinoic acid in a biological sample. Specifically, as the expression 
of the protein is believed to be under the direct control of retinoic acid receptors, the level of the 
protein of the invention, or of the mRNA encoding the protein, can serve as a sensitive and 
immediate marker for the effects of retinoic acid upon a cell. An ability to detect retinoic acid 
receptor activation in cells using the present protein has numerous uses. For instance, the protein of 
the invention or part thereof can be used to monitor the effects of retinoic acid on cells of a patient 
undergoing retinoic acid treatment for promyelocytic leukemia (Slack et al. Ann Hematol 
2000;79:227-38). As retinoic acid treatment is associated with frequent delayed dose-dependent 
side effects, it is believed that an assay based on the protein of SEQ ID NO:305 could be used to adjust 
the dose of retinoic acid administered to patients affected with promyelocytic leukemia, in order to 
predict and avoid such adverse side effects (Slack et al. Ann Hematol 2000;79:227-38). 

In another embodiment, the present polypeptides and polynucleotides can be used to 
identify myeloid precursors, as well as brain and prostate tissues. The ability to specifically 
visualize myeloid precursor cells, as well as brain and prostate tissues (and cells derived from the 
tissues), is useful for any of a number of applications, including to determine the origin or identity 
of, e.g. cancerous cells, as well as to facilitate the identification of particular cells and tissues for, 
e.g. the evaluation of histological slides. In addition, such assays can be used to examine the extent 
of differentiation in myeloid precursor cells. 

The present invention further relates to in vitro assays and diagnostic kits based on the 
protein of the present invention or part thereof. Such assays may be used for diagnosis of disorders 
where the protein activity is abnormally downregulated, such as cancer, and hematological 
disorders including leukemia. As the protein of SEQ ID NO:305, RAR-α, and acute promyelocytic 
leukemia are all related, variation in the measured level of the protein of the invention or 
part thereof can be used as a diagnostic or screening test for acute promyelocytic leukemia, e.g. 
using a biological sample such as serum or plasma. Further, an assay that can detect an abnormal 
level of the protein of the invention or part thereof can be used to detect residual disease in acute 
promyelocytic leukemia. Such an assay may be used to aid therapeutic decisions in this disorder, 
e.g. more or less aggressive treatments, the duration of treatment, etc. 

In another embodiment, various methods can be used to modulate activity and/or expression 
of the protein of SEQ ID NO:305, e.g. for the treatment, attenuation and/or prevention of various 
disorders. In one embodiment, any of a number of reagents, e.g. polynucleotides encoding the 
protein of SEQ ID NO:305 or a fragment thereof, the protein of SEQ ID NO:305 itself, or a 
compound that increases the expression or activity of the protein of SEQ ID NO:305, can be 
administered to a patient suffering from, or at risk of developing, various disorders including 
cancer, and hematological diseases such as leukemia and neutropenia. For instance, and without 
limitation, proteins or other compounds capable of enhancing the expression or activity of the protein of SEQ ID 
NO:305 can be administered to treat patients affected with acute promyelocytic leukemia, in order 
to induce differentiation of the affected cells into mature granulocytes (Slack et al. Ann Hematol 
2000;79:227-38). In still another preferred embodiment, proteins or other compounds capable of 
increasing the expression or activity of the protein of the invention can be used to treat, prevent 
and/or attenuate neutropenia or agranulocytosis in patients, in order to induce in vivo differentiation of 
myeloid precursors into mature granulocytes. In still another preferred embodiment, proteins or 
other compounds capable of increasing the expression or activity of the protein of SEQ ID NO:305 
can be used to treat coagulopathic diseases, such as thrombosis or hemorrhagic manifestations. For 
instance, they can be used to treat disseminated intravascular coagulation, a severe hemorrhagic 
syndrome. This embodiment is supported by the fact that acute promyelocytic leukemia is 
frequently associated with disseminated intravascular coagulation (Tallman et al. Semin Thromb 
Hemost 1999;25:209-15), and disseminated intravascular coagulation is efficiently corrected with 
retinoic acid (Dombret et al. Leukemia 1993;7:2-9). 

In addition, modulation of the expression or activity of the protein of the invention can be 
used to modulate differentiation of cells, e.g. promyelocytic leukemia cells. In one such embodiment, the 
protein of the invention is inhibited, e.g. using antisense molecules, antibodies, or small molecule 
inhibitors of the expression or activity of the protein, in order to maintain the undifferentiated state 
of cells grown in vitro. Alternatively, agents that increase the expression or activity of the protein 
in cells can be used to induce cellular differentiation, e.g. in the preparation of specific cell types in 
vitro for particular therapeutic applications. 

Protein of SEQ ID NO:248 (internal designation 105-035-2-0-C6-CS) and SEQ ID NO:313 
(internal designation 188-28-4-0-D4-CS) 

The proteins of SEQ ID NO:248, encoded by the cDNA of SEQ ID NO:7, and SEQ ID 
NO:313, encoded by the cDNA of SEQ ID NO:72, are highly expressed in brain, liver, pancreas, and 
testis. The proteins of the invention are nuclear proteins (Miller et al. J Biol Chem 
2000;275:32052-6) that display a membrane-spanning segment from amino acids 58 to 78. These 
proteins are homologous to the human RNA polymerase II elongation factor ELL3 (EMBL 
accession number AF276512; Miller et al. J Biol Chem 2000;275:32052-6). In addition, the 
proteins of SEQ ID NO:248 and SEQ ID NO:313 share sequence homology with two other 
members of the polymerase II elongation factor family: ELL and ELL2. The protein of SEQ ID 
NO:313 is similar to the N-terminal sequence of the protein of SEQ ID NO:248, but differs after 
residue 240 because of a frameshift that produces a premature stop in the sequence of SEQ ID NO:72 
(Miller et al. J Biol Chem 2000;275:32052-6). Additionally, the alignment of the protein of SEQ 
ID NO:248 with occludin, an integral membrane protein found at tight junctions (Furuse et al. J Cell 
Biol 1994;127:1617-26), reveals that both proteins display a C-terminal ZO-1 binding domain, with 
26% homology over a 108 amino acid segment. The protein of SEQ ID NO:313 lacks this domain, as its 
C-terminal region is truncated as compared to the protein of SEQ ID NO:248. ZO-1 is part of the 
family of membrane-associated guanylate kinase homologs (MAGUKs) believed to be important in 
signal transduction originating from sites of cell-cell contact (Willott et al. Proc Natl Acad Sci USA 
1993;90:7834-8). 

The proteins of SEQ ID NOs:248 and 313 are RNA polymerase II elongation factors that 
increase the catalytic rate of transcription elongation, a phase during which RNA polymerase II 
moves along the DNA and extends the growing RNA chain (Miller et al. J Biol Chem 
2000;275:32052-6). Specifically, the proteins of SEQ ID NOs:248 and 313 suppress transient pausing at 
multiple sites along the DNA, thereby altering the Km and/or the Vmax of the polymerase (Miller et 
al. J Biol Chem 2000;275:32052-6). The present proteins belong to a family that is known to 
include one virally encoded protein (Tat) and six cellular proteins (SII, P-TEFb, TFIIF, Elongin 
(SIII), ELL and ELL2). 
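
For orientation, the kinetic parameters usually used to describe such a rate enhancement are those of the standard Michaelis-Menten relation. The source does not write this relation out, and applying it to nucleotide addition by RNA polymerase II is an assumption made here for illustration:

```latex
v = \frac{V_{\max}\,[S]}{K_m + [S]}
```

Here v is taken as the rate of nucleotide addition and [S] as the nucleoside triphosphate concentration. Under this reading, an elongation factor that suppresses pausing would be expected to lower the apparent K_m, raise the apparent V_max, or both.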

A growing body of evidence suggests that mis-regulation of elongation may be a key 
element in a variety of human diseases (see Aso et al. J Clin Invest 1996;97:1561-9). For instance, 
two RNA polymerase II elongation proteins have been implicated in oncogenesis: ELL, which is a 
frequent target for translocation in acute myeloid leukemia (Thirman et al. Proc Natl Acad Sci USA 
1994;91:12110-4; Mitani et al. Blood 1995;85:2017-24), and elongin, which is a transcription 
factor regulated by the product of the von Hippel-Lindau tumor suppressor gene, which is itself 
mutated in the majority of clear-cell renal carcinomas and in families with von Hippel-Lindau 
disease (Duan et al. Science 1995;269:1402-6; Kibel et al. Science 1995;269:1444-6). In addition, 
overexpression of ELL leads to the transformation of fibroblasts (Kanda et al. J Biol Chem 
1998;273:5248-52). Thus, the proteins of SEQ ID NOs:248 and 313 may be important for 
oncogenesis of multiple types of neoplastic diseases, especially hematological malignancies. 

In one embodiment, the present proteins are used to increase the rate of transcription in 
vitro. Such an increase can be used for any of the large number of in vitro transcription reactions 
which are routinely used for molecular biological applications, e.g. for the preparation of RNA, for 
protein production, for the characterization of promoters and transcription factors, etc. 

In another embodiment, the present invention provides diagnostic tools for the detection of 
mutations in the genes encoding SEQ ID NOs:248 or 313. Such mutations may be detected by a 
variety of techniques, including RNase and S1 protection assays; alterations in electrophoretic 
mobility of DNA fragments in gels, with or without denaturing agents (e.g. SSCP or DGGE); 
dHPLC; and direct DNA sequencing. The detection of mutations in the genes encoding SEQ ID 
NOs:248 or 313 is useful for the detection of a number of diseases and conditions, such as cancers 
and hematological malignancies including leukemia. For example, the RNA polymerase II 
Elongation Factor ELL gene undergoes frequent translocations in acute myeloid leukemia (Thirman 
et al. Proc Natl Acad Sci USA 1994;91:12110-4; Mitani et al. Blood 1995;85:2017-24), and it is 
likely that other elongation factors are involved in additional such diseases. 

Another embodiment of the present invention relates to compositions and methods for 
using the proteins or part thereof to specifically visualize myeloid precursor cells, as well as 
pancreas, liver and testis tissues (and cells derived from the tissues). The ability to detect such cell 
types is useful for any of a number of applications, including to determine the origin or identity of, 
e.g. cancerous cells, as well as to facilitate the identification of particular cells and tissues for, e.g. 
the evaluation of histological slides. In addition, such methods can be used to examine the extent of 
differentiation in myeloid or myeloid-progenitor cells for staging of leukemia or any other 
neoplastic disorder. Any method for detecting the presence of the proteins of the invention, or 
nucleic acids encoding the proteins, can be used, including methods involving the use of antibodies 
immunospecific for the proteins of the invention. Such antibodies can be used in various methods 
including radioimmunoassays, competitive binding assays, Western blot analysis and 
enzyme-linked immunosorbent assays (ELISA), or any other technique known to those skilled in the 
art. In another embodiment, the present protein or part thereof can be used for the treatment, 
attenuation and/or prevention of conditions associated with unbalanced amounts and/or activity of 
the protein of SEQ ID NO:248 or 313. Other modulatory substances can also be used in such 
embodiments, including chemical compounds such as agonists and antagonists, nucleic acids 
including antisense and ribozyme sequences, and antibodies. In a preferred embodiment, such 
substances are employed for the treatment or prevention of certain types of neoplastic disorders 
such as cancer or hematological malignancies such as leukemia. In such embodiments, 
where an increased level of expression or activity of the present proteins is correlated with the 
presence of a disease such as cancer, the disease can be treated or prevented using any agent that 
can provoke a decrease in the level of activity or expression of the protein, such as antibodies, 
antisense molecules, ribozymes, dominant negative forms of the protein, compounds that inhibit the 
expression or activity of the proteins, and others. Alternatively, in cases where a decreased level of 
expression or activity of the proteins is correlated with the presence of a disease such as cancer, the 
disease can be treated using any agent that can cause an increase in the expression or activity of the 
protein, such as polynucleotides encoding the proteins, purified forms of the proteins, or any 
compound that causes an increase in the expression or activity of the proteins. Further, any 
detection of a correlation between the level of expression or activity of the protein and the presence 
or absence of a disease can be used to develop diagnostic or screening tools for the detection of the 
disease itself, or of a predisposition for the disease. 

Uses of antibodies 

Antibodies of the present invention have uses that include, but are not limited to, methods 
known in the art to purify, detect, and target the polypeptides of the present invention including 
both in vitro and in vivo diagnostic and therapeutic methods. An example of such use, employing 
immunoaffinity chromatography, is given below. The antibodies of the present invention may be 
used either alone or in combination with other compositions. For example, the antibodies have use 
in immunoassays for qualitatively and quantitatively measuring levels of antigen-bearing substances, 
including the polypeptides of the present invention, in biological samples (see, e.g., Harlow et al., 
1988, incorporated by reference in its entirety). The antibodies may also be used in therapeutic 
compositions for killing cells expressing the protein or reducing the levels of the protein in the body. 

The invention further relates to antibodies that act as agonists or antagonists of the 
polypeptides of the present invention. For example, the present invention includes antibodies that 
disrupt the receptor/ligand interactions with the polypeptides of the invention either partially or 
fully. Included are both receptor-specific antibodies and ligand-specific antibodies. Included are 
receptor-specific antibodies which do not prevent ligand binding but prevent receptor activation. 
Receptor activation (i.e., signaling) may be determined by techniques described herein or otherwise 
known in the art. Also included are receptor-specific antibodies which prevent both ligand binding 
and receptor activation. Likewise, included are neutralizing antibodies that bind the ligand and 
prevent binding of the ligand to the receptor, as well as antibodies that bind the ligand, thereby 
preventing receptor activation, but do not prevent the ligand from binding the receptor. Further 
included are antibodies that activate the receptor. These antibodies may act as agonists for either all 
or less than all of the biological activities affected by ligand-mediated receptor activation. The 
antibodies may be specified as agonists or antagonists for biological activities comprising specific 
activities disclosed herein. The above antibody agonists can be made using methods known in the 
art. See, e.g., WO 96/40281; US Patent 5,811,097; Deng et al. (1998); Chen et al. (1998); Harrop et 
al. (1998); Zhu et al. (1998); Yoon et al. (1998); Prat et al. (1998); Pitard et al. (1997); Liautard et 
al. (1997); Carlson et al. (1997); Taryman et al. (1995); Muller et al. (1998); Bartunek et al. (1996) 
(said references incorporated by reference in their entireties). 

As discussed above, antibodies of the polypeptides of the invention can, in turn, be utilized 
to generate anti-idiotypic antibodies that "mimic" polypeptides of the invention using techniques 
well known to those skilled in the art (see, e.g., Greenspan and Bona (1989) and Nissinoff (1991), 
which disclosures are hereby incorporated by reference in their entireties). For example, antibodies 
which bind to and competitively inhibit polypeptide multimerization or binding of a polypeptide of 
the invention to ligand can be used to generate anti-idiotypes that "mimic" the polypeptide 
multimerization or binding domain and, as a consequence, bind to and neutralize polypeptide or its 
ligand. Such neutralizing anti-idiotypic antibodies can be used to bind a polypeptide of the 
invention or to bind its ligands/receptors, and thereby block its biological activity. 

Immunoaffinity Chromatography 

Antibodies prepared as described herein are coupled to a support. Preferably, the antibodies 
are monoclonal antibodies, but polyclonal antibodies may also be used. The support may be any of 
those typically employed in immunoaffinity chromatography, including Sepharose CL-4B 
(Pharmacia, Piscataway, NJ), Sepharose CL-2B (Pharmacia, Piscataway, NJ), Affi-gel 10 (Biorad, 
Richmond, CA), or glass beads. 

The antibodies may be coupled to the support using any of the coupling reagents typically 
used in immunoaffinity chromatography, including cyanogen bromide. After coupling the antibody 
to the support, the support is contacted with a sample which contains a target polypeptide whose 
isolation, purification or enrichment is desired. The target polypeptide may be a polypeptide 
selected from the group consisting of sequences of SEQ ID Nos: 242-482, mature polypeptides 
included in SEQ ID Nos: 242-272 and 274-384 as well as full-length and mature polypeptides 
encoded by the clone inserts of the deposited clone pool, variants and fragments thereof, or a fusion 
protein comprising said selected polypeptide or a fragment thereof. 

Preferably, the sample is placed in contact with the support for a sufficient amount of time 
and under appropriate conditions to allow at least 50% of the target polypeptide to specifically bind 
to the antibody coupled to the support. 

Thereafter, the support is washed with an appropriate wash solution to remove polypeptides 
which have non-specifically adhered to the support. The wash solution may be any of those 
typically employed in immunoaffinity chromatography, including PBS, Tris-lithium chloride buffer 
(0.1 M lysine base and 0.5 M lithium chloride, pH 8.0), Tris-hydrochloride buffer (0.05 M 
Tris-hydrochloride, pH 8.0), or Tris/Triton/NaCl buffer (50 mM Tris.Cl, pH 8.0 or 9.0, 0.1% Triton 
X-100, and 0.5 M NaCl). 
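
As a worked illustration of the concentrations quoted above, the following sketch converts a molarity into the mass of solute to weigh out. The molar masses are approximate standard reference values rather than values from the source, and the helper name and the 1 L batch size are hypothetical.

```python
# Approximate molar masses in g/mol (standard reference values, not from the source).
MOLAR_MASS = {
    "NaCl": 58.44,
    "LiCl": 42.39,
    "Tris-HCl": 157.60,
}


def grams_needed(solute: str, molarity: float, volume_l: float) -> float:
    """Mass of solute (g) for a solution of the given molarity (mol/L) and volume (L)."""
    if molarity < 0 or volume_l <= 0:
        raise ValueError("molarity must be non-negative and volume positive")
    return MOLAR_MASS[solute] * molarity * volume_l


# Hypothetical 1 L batches of two of the wash-buffer components named above.
wash_recipe = {
    "0.5 M NaCl": grams_needed("NaCl", 0.5, 1.0),
    "0.05 M Tris-HCl": grams_needed("Tris-HCl", 0.05, 1.0),
}
```

The sketch covers only the salt masses; adjusting pH and adding Triton X-100 by volume percent are separate steps that it does not model.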

After washing, the specifically bound target polypeptide is eluted from the support using the 
high pH or low pH elution solutions typically employed in immunoaffinity chromatography. In 
particular, the elution solutions may contain an eluant such as triethanolamine, diethylamine, 
calcium chloride, sodium thiocyanate, potassium bromide, acetic acid, or glycine. In some 
embodiments, the elution solution may also contain a detergent such as Triton X-100 or 
octyl-beta-D-glucoside. 

Import vectors 

The GENSET polypeptides of the invention may also be used as a carrier to import a 
protein or peptide of interest, so-called cargo, into tissue-culture cells or into host organisms. A 
hydrophobic region of a GENSET polypeptide or a fragment thereof, preferably the signal peptide 
of a sequence selected from the group consisting of SEQ ID Nos: 1-31 and 33-143 and clone 
inserts of the deposited clone pool, more preferably the short core hydrophobic region (h) of signal 
peptides, may be used as a carrier. 

When cell permeable peptides of limited size (approximately up to 25 amino acids) are to
be translocated across the cell membrane, chemical synthesis may be used in order to add the h region
to either the C-terminus or the N-terminus of the cargo peptide of interest. Alternatively, when
longer peptides or proteins are to be imported into cells, nucleic acids can be genetically engineered,
using techniques familiar to those skilled in the art, in order to link the cDNA sequence or fragment
thereof encoding the hydrophobic region to the 5' or the 3' end of a DNA sequence coding for a
cargo polypeptide. Such genetically engineered nucleic acids are then translated either in vitro or in
vivo after transfection into appropriate cells, using conventional techniques to produce the resulting
cell permeable polypeptide. Suitable host cells are then simply incubated with the cell permeable
polypeptide, which is then translocated across the membrane.

This method may be applied to study diverse intracellular functions and cellular processes.
For instance, it has been used to probe functionally relevant domains of intracellular proteins and to
examine protein-protein interactions involved in signal transduction pathways (Lin et al., J. Biol.
Chem., 270: 14225-14258 (1995); Lin et al., J. Biol. Chem., 271: 5305-5308 (1996); Rojas et al., J.
Biol. Chem., 271: 27456-27461 (1996); Rojas et al., Nature Biotech., 16: 370-375 (1998); Liu et al.,
Proc. Natl. Acad. Sci. USA, 93: 11819-11824 (1996); Rojas et al., Bioch. Biophys. Res. Commun.,
234: 675-680 (1997); Du et al., J. Peptide Res., 51: 235-243 (1998)).
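
The genetic-engineering route described above, in which the sequence encoding the hydrophobic (h) region is linked to the 5' or the 3' end of the cargo coding sequence, may be illustrated by the following minimal sketch. It assumes that both inputs are coding sequences supplied without start or stop codons, and it only checks that the junction keeps a single open reading frame. The function names and the short example sequences are hypothetical and are not taken from the sequence listing.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence whose length is a multiple of 3."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def fuse_in_frame(h_region_cds: str, cargo_cds: str,
                  h_region_at_5prime: bool = True) -> str:
    """Link the h-region coding sequence to the 5' or 3' end of the cargo.

    Both inputs are assumed to lack start and stop codons (an assumption of
    this sketch); an ATG and a TAA stop codon are added around the fusion.
    """
    parts = [h_region_cds, cargo_cds] if h_region_at_5prime else [cargo_cds, h_region_cds]
    body = "".join(p.upper() for p in parts)
    # Reject junctions that shift the frame or introduce a premature stop.
    if any(len(p) % 3 != 0 for p in parts) or "*" in translate(body):
        raise ValueError("fusion does not preserve a single open reading frame")
    return "ATG" + body + "TAA"


# Hypothetical inputs: a short hydrophobic stretch and a short cargo peptide.
h_region = "GCTGCTGTTCTGCTG"   # encodes A-A-V-L-L
cargo = "AAAGAAGGT"            # encodes K-E-G
print(translate(fuse_in_frame(h_region, cargo)))
```

Setting `h_region_at_5prime=False` places the hydrophobic region at the C-terminal side of the cargo instead, corresponding to linkage at the 3' end.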

Such techniques may be used in cellular therapy to import proteins producing therapeutic 
effects. For instance, cells isolated from a patient may be treated with imported therapeutic proteins 
and then re-introduced into the host organism.

Alternatively, the hydrophobic region of signal peptides of the present invention could be
used in combination with a nuclear localization signal to deliver nucleic acids into the cell nucleus.
Such nucleic acids may be antisense oligonucleotides or oligonucleotides designed to form triple
helixes, as described herein, in order to inhibit, respectively, processing or maturation of a target
cellular RNA.

Expression of GENSET products 



Spatial expression of the GENSET genes of the invention 

Tissue expression of the cDNAs of the present invention was examined. Table IX lists the
Genset libraries of tissues and cell types examined that express the polynucleotides of the present
invention. The tissues and cell types examined for polynucleotide expression were: adrenal gland
(AG), bone marrow (BM), brain (Br), cancerous prostate (CP), cerebellum (Ce), colon (Co),
dystrophic muscle (DM), fetal brain (FB), fetal kidney (FK), fetal liver (FL), heart (He),
hypertrophic prostate (HP), kidney (Ki), liver (Li), lung (Lu), lung cells (LC), lymph ganglia (LG),
lymphocytes (Ly), muscle (Mu), ovary (Ov), pancreas (Pa), pituitary gland (PG), placenta (Pl),
prostate (Pr), salivary gland (SG), spinal cord (SC), spleen (Sp), stomach/intestine (SI), substantia
nigra (SN), testis (Te), thyroid (Ty), umbilical cord (UC) and uterus (Ut).

For each cDNA referred to by its sequence identification number (first column), the number
of proprietary 5'ESTs (i.e. cDNA fragments) expressed in a particular tissue referred to by its name
is indicated after a semicolon (second column). In addition, the bias in the spatial distribution of
the polynucleotide sequences of the present invention was examined by comparing the relative
proportions of the biological polynucleotides of a given tissue using the following statistical
analysis. The under- or over-representation of a polynucleotide of a given cluster in a given tissue
was assessed using the normal approximation of the binomial distribution. When the observed
proportion of a polynucleotide of a given tissue in a given consensus had less than a 1% chance of
occurring randomly according to the chi-square test, the frequency bias was reported as "preferred".
The results are given in Table X as follows. For each polynucleotide showing a bias in tissue
distribution, as referred to by its sequence identification number in the first column, the list of
tissues where the polynucleotides are under-represented is given in the second column entitled "low
frequency expression" and the list of tissues where the polynucleotides are over-represented is given
in the third column entitled "high frequency expression".
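
The tissue-bias test described above may be sketched as follows. This is only an illustration under stated assumptions: the null proportion `p0` is taken to be the fraction of all ESTs that originate from the tissue in question, which the text does not specify, and the function name and return values are hypothetical. For a single proportion, the square of the normal-approximation z statistic is the chi-square statistic with one degree of freedom, so the two formulations give the same two-sided p-value.

```python
import math


def tissue_bias(k: int, n: int, p0: float, alpha: float = 0.01):
    """Classify the representation of a cluster in one tissue.

    k     -- ESTs of the cluster observed in the tissue
    n     -- total ESTs of the cluster over all tissues
    p0    -- expected proportion under the null hypothesis (assumed here to
             be the tissue's share of all ESTs; not stated in the source)
    alpha -- significance threshold (1% in the text)
    """
    if n <= 0 or not 0.0 < p0 < 1.0:
        raise ValueError("n must be positive and p0 strictly between 0 and 1")
    expected = n * p0
    sd = math.sqrt(n * p0 * (1.0 - p0))      # binomial standard deviation
    z = (k - expected) / sd                  # normal approximation
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail probability
    if p_value >= alpha:
        return "none", z, p_value
    return ("over" if z > 0 else "under"), z, p_value


# Hypothetical counts: 30 of 100 ESTs from a tissue contributing 10% of ESTs.
print(tissue_bias(30, 100, 0.10))
```

The normal approximation is generally considered adequate only when both `n * p0` and `n * (1 - p0)` are reasonably large; for small clusters an exact binomial test would be the more cautious choice.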

The cellular localization of some polypeptides of the invention was also determined using
the "psort software" (Nakai and Horton, (1999); Nakai and Kanehisa, (1992), which disclosures are
hereby incorporated by reference in their entireties). For each polypeptide identified by its
sequence identification number in the first column, the second column of Table XI lists the predicted
subcellular localization.

Evaluation of Expression Levels and Patterns of GENSET mRNAs

The spatial and temporal expression patterns of GENSET mRNAs, as well as their 
expression levels, may also be further determined as follows. 

Expression levels and patterns of GENSET mRNAs may be analyzed by solution
hybridization with long probes as described in International Patent Application No. WO 97/05277,
the entire contents of which are hereby incorporated by reference. Briefly, a GENSET
polynucleotide, or fragment thereof, corresponding to the gene encoding the mRNA to be
characterized is inserted at a cloning site immediately downstream of a bacteriophage (T3, T7 or
SP6) RNA polymerase promoter to produce antisense RNA. Preferably, the GENSET
polynucleotide is at least 100 nucleotides in length. The plasmid is linearized and transcribed in
the presence of ribonucleotides comprising modified ribonucleotides (i.e. biotin-UTP and
DIG-UTP). An excess of this doubly labeled RNA is hybridized in solution with mRNA isolated from
cells or tissues of interest. The hybridizations are performed under standard stringent conditions
(40-50°C for 16 hours in an 80% formamide, 0.4 M NaCl buffer, pH 7-8). The unhybridized probe
is removed by digestion with ribonucleases specific for single-stranded RNA (i.e. RNases CL3, T1,
Phy M, U2 or A). The presence of the biotin-UTP modification enables capture of the hybrid on a
microtitration plate coated with streptavidin. The presence of the DIG modification enables the
hybrid to be detected and quantified by ELISA using an anti-DIG antibody coupled to alkaline
phosphatase.

The GENSET cDNAs, or fragments thereof, may also be tagged with nucleotide sequences
for the serial analysis of gene expression (SAGE) as disclosed in UK Patent Application No. 2 305
241 A, the entire contents of which are incorporated by reference. In this method, cDNAs are
prepared from a cell, tissue, organism or other source of nucleic acid for which it is desired to
determine gene expression patterns. The resulting cDNAs are separated into two pools. The
cDNAs in each pool are cleaved with a first restriction endonuclease, called an "anchoring enzyme,"
having a recognition site which is likely to be present at least once in most cDNAs. The fragments
which contain the 5' or 3' most region of the cleaved cDNA are isolated by binding to a capture
medium such as streptavidin coated beads. A first oligonucleotide linker having a first sequence for
hybridization of an amplification primer and an internal restriction site for a "tagging endonuclease"
is ligated to the digested cDNAs in the first pool. Digestion with this second endonuclease (the
tagging endonuclease) produces short "tag" fragments from the cDNAs. A second oligonucleotide
having a second sequence for hybridization of an amplification primer and an internal restriction
site is ligated to the digested cDNAs in the second pool. The cDNA fragments in the second pool are
also digested with the "tagging endonuclease" to generate short "tag" fragments derived from the
cDNAs in the second pool. The "tags" resulting from digestion of the first and second pools with the
anchoring enzyme and the tagging endonuclease are ligated to one another to produce "ditags." In
some embodiments, the ditags are concatamerized to produce ligation products containing from 2 to
200 ditags. The tag sequences are then determined and compared to the sequences of the GENSET
cDNAs to determine which genes are expressed in the cell, tissue, organism, or other source of
nucleic acids from which the tags were derived. In this way, the expression pattern of a GENSET
gene in the cell, tissue, organism, or other source of nucleic acids is obtained.
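
The final comparison step of the SAGE method, in which sequenced tags are matched to the cDNA sequences, may be sketched as follows. The sketch assumes, purely for illustration, an anchoring enzyme recognizing CATG and a 10-nucleotide tag taken immediately downstream of the 3'-most anchoring site; neither value is specified in the text above. The function names and sequences are hypothetical.

```python
from collections import Counter

ANCHOR_SITE = "CATG"   # assumed anchoring-enzyme recognition site
TAG_LENGTH = 10        # assumed tag length in nucleotides


def reference_tag(cdna: str):
    """Return the tag expected from a cDNA, or None if no tag can be derived."""
    cdna = cdna.upper()
    pos = cdna.rfind(ANCHOR_SITE)            # 3'-most anchoring site
    if pos == -1:
        return None
    start = pos + len(ANCHOR_SITE)
    tag = cdna[start:start + TAG_LENGTH]
    return tag if len(tag) == TAG_LENGTH else None


def expression_profile(observed_tags, cdnas):
    """Count observed tags per cDNA; tags matching no cDNA are ignored."""
    tag_to_name = {}
    for name, seq in cdnas.items():
        tag = reference_tag(seq)
        if tag is not None:
            tag_to_name[tag] = name
    counts = Counter()
    for tag in observed_tags:
        name = tag_to_name.get(tag.upper())
        if name is not None:
            counts[name] += 1
    return dict(counts)


# Hypothetical cDNAs and sequenced tags.
cdnas = {
    "cDNA_A": "GGGCATGAAAAACCCCCTTT",
    "cDNA_B": "CATGTTTTTGGGGGAACATGACGTACGTACGG",
}
tags = ["AAAAACCCCC", "ACGTACGTAC", "AAAAACCCCC", "TTTTTTTTTT"]
print(expression_profile(tags, cdnas))
```

Two cDNAs that happen to share the same tag would collide in `tag_to_name`; a fuller implementation would need to report such ambiguous tags rather than silently assign them.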

Quantitative analysis of GENSET gene expression may also be performed using arrays. For
example, quantitative analysis of gene expression may be performed with GENSET
polynucleotides, or fragments thereof, in a complementary DNA microarray as described by Schena
et al. (1995 and 1996), which disclosures are hereby incorporated by reference in their entireties.
GENSET cDNAs or fragments thereof are amplified by PCR and arrayed from 96-well microtiter
plates onto silylated microscope slides using high-speed robotics. Printed arrays are incubated in a
humid chamber to allow rehydration of the array elements and rinsed, once in 0.2% SDS for 1 min,
twice in water for 1 min and once for 5 min in sodium borohydride solution. The arrays are
submerged in water for 2 min at 95°C, transferred into 0.2% SDS for 1 min, rinsed twice with
water, air dried and stored in the dark at 25°C. Cell or tissue mRNA is isolated or commercially
obtained and probes are prepared by a single round of reverse transcription. Probes are hybridized
to 1 cm² microarrays under a 14 x 14 mm glass coverslip for 6-12 hours at 60°C. Arrays are
washed for 5 min at 25°C in low stringency wash buffer (1X SSC/0.2% SDS), then for 10 min at
room temperature in high stringency wash buffer (0.1X SSC/0.2% SDS). Arrays are scanned in
0.1X SSC using a fluorescence laser scanning device fitted with a custom filter set. Accurate
differential expression measurements are obtained by taking the average of the ratios of two
independent hybridizations.
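
The averaging of ratios from two independent hybridizations may be sketched as follows. The data layout (one sample signal and one reference signal per array element) and the function name are assumptions made for illustration and are not part of the cited protocol.

```python
def differential_expression(hyb1, hyb2):
    """Average, per array element, the sample/reference ratios of two hybridizations.

    Each argument maps an array element name to a (sample_signal,
    reference_signal) pair; only elements measured in both hybridizations
    with non-zero reference signals are reported.
    """
    averaged = {}
    for element in hyb1.keys() & hyb2.keys():
        (s1, r1), (s2, r2) = hyb1[element], hyb2[element]
        if r1 == 0 or r2 == 0:
            continue  # ratio undefined; skip rather than guess
        averaged[element] = (s1 / r1 + s2 / r2) / 2.0
    return averaged


# Hypothetical fluorescence intensities for one array element.
print(differential_expression({"element_1": (200.0, 100.0)},
                              {"element_1": (300.0, 100.0)}))
```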

Quantitative analysis of the expression of genes may also be performed with GENSET
cDNAs or fragments thereof in complementary DNA arrays as described by Pietu et al. (1996),
which disclosure is hereby incorporated by reference in its entirety. The GENSET polynucleotides
of the invention or fragments thereof are PCR amplified and spotted on membranes. Then, mRNAs
originating from various tissues or cells are labeled with radioactive nucleotides. After
hybridization and washing in controlled conditions, the hybridized mRNAs are detected by
phospho-imaging or autoradiography. Duplicate experiments are performed and a quantitative
analysis of differentially expressed mRNAs is then performed.

Alternatively, expression analysis of GENSET genes can be done through high density
nucleotide arrays as described by Lockhart et al. (1996) and Sosnowski et al. (1997), which
disclosures are hereby incorporated by reference in their entireties. Oligonucleotides of 15-50
nucleotides corresponding to sequences of a GENSET polynucleotide or fragments thereof are
synthesized directly on the chip (Lockhart et al., supra) or synthesized and then addressed to the
chip (Sosnowski et al., supra). Preferably, the oligonucleotides are about 20 nucleotides in length.
cDNA probes labeled with an appropriate compound, such as biotin, digoxigenin or fluorescent dye,
are synthesized from the appropriate mRNA population and then randomly fragmented to an
average size of 50 to 100 nucleotides. The said probes are then hybridized to the chip. After
washing as described in Lockhart et al. (supra) and application of different electric fields
(Sosnowski et al., supra), the dyes or labeling compounds are detected and quantified. Duplicate
hybridizations are performed. Comparative analysis of the intensity of the signal originating from
cDNA probes on the same target oligonucleotide in different cDNA samples indicates a differential
expression of the GENSET mRNA.

Uses of GENSET expression data 

Once the expression levels and patterns of a GENSET mRNA have been determined using
any technique known to those skilled in the art, in particular those described in the section entitled
"Evaluation of Expression Levels and Patterns of GENSET mRNAs", or using the instant
disclosure, this information may be used to design GENSET specific markers for detection,
identification, screening and diagnosis purposes as well as to design DNA constructs with an
expression pattern similar to a GENSET expression pattern.

Detection of GENSET expression and/or biological activity

The invention further relates to methods of detection of GENSET expression and/or 
biological activity in a biological sample using the polynucleotide and polypeptide sequences 
described herein. Such methods can be used, for example, as a screen for normal or abnormal
GENSET expression and/or biological activity and, thus, can be used diagnostically. The biological
sample for use in the methods of the present invention includes a suitable sample from, for example,
a mammal, particularly a human. For example, the sample can be obtained from tissues or cell lines
having the same origin as tissues or cell lines in which the polypeptide is known to be expressed,
based on the data from Table IX.

Detection of GENSET products 

The invention further relates to methods of detection of GENSET polynucleotides or
polypeptides in a sample using the sequences described herein and any techniques known to those
skilled in the art. For example, a labeled polynucleotide probe having all or a functional portion of
the nucleotide sequence of a GENSET polynucleotide can be used in a method to detect a GENSET
polynucleotide in a sample. In one embodiment, the sample is treated to render the polynucleotides
in the sample available for hybridization to a polynucleotide probe, which can be DNA or RNA.
The resulting treated sample is combined with a labeled polynucleotide probe having all or a portion
of the nucleotide sequence of the GENSET cDNA or genomic sequence, under conditions
appropriate for hybridization of complementary sequences to occur. Detection of hybridization of
polynucleotides from the sample with the labeled nucleic acid probe indicates the presence of
GENSET polynucleotides in a sample. The presence of GENSET mRNA is indicative of GENSET
expression.

Consequently, the invention comprises methods for detecting the presence of a
polynucleotide comprising a nucleotide sequence selected from a group consisting of the sequences
of SEQ ID Nos: 1-241, the sequences of clone inserts of the deposited clone pool, sequences fully
complementary thereto, fragments and variants thereof in a sample. In a first embodiment, said
method comprises the following steps:

a) bringing into contact said sample and a nucleic acid probe or a plurality of nucleic acid 
probes which hybridize to said selected nucleotide sequence; and 

b) detecting the hybrid complex formed between said probe or said plurality of probes and 
said polynucleotide. 

In a preferred embodiment of the above detection method, said nucleic acid probe or said
plurality of nucleic acid probes is labeled with a detectable molecule. In another preferred
embodiment of the above detection method, said nucleic acid probe or said plurality of nucleic acid
probes has been immobilized on a substrate. In still another preferred embodiment, said nucleic
acid probe or said plurality of nucleic acid probes has a sequence comprised in a sequence
complementary to said selected sequence.

In a second embodiment, said method comprises the following steps:

a) contacting said sample with amplification reaction reagents comprising a pair of
amplification primers located on either side of the region of said nucleotide sequence to be
amplified;

b) performing an amplification reaction to synthesize amplification products containing said
region of said selected nucleotide sequence; and

c) detecting said amplification products.

In a preferred embodiment of the above detection method, when the polynucleotide to be
amplified is an RNA molecule, preliminary reverse transcription and synthesis of a second cDNA
strand are necessary to provide a DNA template to be amplified. In another preferred embodiment
of the above detection method, the amplification product is detected by hybridization with a labeled
probe having a sequence which is complementary to the amplified region. In still another preferred
embodiment, at least one of said amplification primers has a sequence comprised in said selected
sequence or in the sequence complementary to said selected sequence.
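
The requirement that the two amplification primers lie on either side of the region to be amplified may be checked in silico with a sketch such as the following. It assumes exact primer matches on a single template strand and ignores mismatches, melting temperature and secondary structure; the function names and sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def predicted_amplicon(template: str, forward: str, reverse: str):
    """Return the product expected from a primer pair, or None if they do not flank a region.

    The forward primer must match the template strand; the reverse primer
    must match the complementary strand downstream of the forward primer.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# Hypothetical template and primers.
template = "AAACCGGTTTTGATCGATCGAAGGCCTTAAA"
print(predicted_amplicon(template, "CCGGTT", "AAGGCC"))
```

When the starting material is RNA, the same check would be applied to the cDNA obtained after reverse transcription, as noted above.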

Alternatively, a method of detecting GENSET expression in a test sample can be
accomplished using any product which binds to a GENSET polypeptide of the present invention or
a portion of a GENSET polypeptide. Such products may be antibodies, binding fragments of
antibodies, polypeptides able to bind specifically to GENSET polypeptides or fragments thereof,
including GENSET agonists and antagonists. Detection of specific binding to the antibody indicates
the presence of a GENSET polypeptide in the sample (e.g., ELISA).

Consequently, the invention is also directed to a method for detecting specifically the
presence of a GENSET polypeptide according to the invention in a biological sample, said method
comprising the following steps:

a) bringing into contact said biological sample with a product able to bind to a polypeptide
of the invention or fragments thereof;

b) allowing said product to bind to said polypeptide to form a complex; and

c) detecting said complex.

In a preferred embodiment of the above detection method, the product is an antibody. In a 
more preferred embodiment, said antibody is labeled with a detectable molecule. In another more 
preferred embodiment of the above detection method, said antibody has been immobilized on a 
substrate. 

In addition, the invention also relates to methods of determining whether a GENSET
product (e.g. a polynucleotide or polypeptide) is present or absent in a biological sample, said
methods comprising the steps of:

a) obtaining said biological sample from a human or non-human animal, preferably a
mammal;

b) contacting said biological sample with a product able to bind to a GENSET
polynucleotide or polypeptide of the invention; and

c) determining the presence or absence of said GENSET product in said biological sample.

Compounds that specifically bind a GENSET product may either be compounds binding to
a GENSET polypeptide (e.g. binding proteins, antibodies or binding fragments thereof such as
F(ab')2 fragments) or compounds binding to a GENSET polynucleotide (e.g. a complementary probe
or primer).

The present invention also relates to kits that can be used in the detection of GENSET
expression products. The kit can comprise a compound that specifically binds a GENSET
polypeptide (e.g. binding proteins, antibodies or binding fragments thereof such as F(ab')2
fragments) or a GENSET mRNA (e.g. a complementary probe or primer), for example, disposed
within a container means. The kit can further comprise ancillary reagents, including buffers and the
like.



Detection of a GENSET biological activity

The invention further includes methods of detecting specifically a GENSET biological
activity. Assessing the GENSET biological activity may be performed using a variety of
techniques, including those described elsewhere herein.

Consequently, the invention is directed to a method for detecting specifically GENSET
biological activity in a biological sample, said method comprising the following steps:

a) obtaining a biological sample from a human or non-human mammal; and

b) detecting a GENSET biological activity.

The present invention also relates to kits that can be used in the detection of GENSET
biological activity.

Identification of a specific context of GENSET expression 

When the expression pattern of a GENSET mRNA shows that a GENSET gene is
specifically expressed in a given context, probes and primers specific for this gene as well as
antibodies binding to the GENSET polypeptide may then be used as markers for a specific
context. Examples of specific contexts are: specific expression in a given tissue/cell or tissue/cell
type, expression at a given stage of development of a process such as embryo development or
disease development, or specific expression in a given organelle. Such primers, probes, and
antibodies are useful commercially to identify tissues/cells/organelles of unknown origin, for
example, forensic samples or differentiated tumor tissue that has metastasized to foreign bodily sites,
or to differentiate different tissue types in a tissue cross-section using any technique known to those
skilled in the art, including, for example, in situ PCR or immunochemistry.

For example, the cDNAs and proteins of the sequence listing, and fragments thereof, may be
used to distinguish human tissues/cells from non-human tissues/cells and to distinguish between
human tissues/cells/organelles that do and do not express the polynucleotides comprising the
cDNAs. By knowing the expression pattern of a given GENSET gene, either through routine
experimentation or by using the instant disclosure, the polynucleotides and polypeptides of the
present invention may be used in methods of determining the identity of an unknown tissue/cell
sample/organelle. As part of determining the identity of an unknown tissue/cell sample/organelle,
the polynucleotides and polypeptides of the present invention may be used to determine what the
unknown tissue/cell sample is and what the unknown sample is not. For example, if a cDNA is
expressed in a particular tissue/cell type/organelle, and the unknown tissue/cell sample/organelle
does not express the cDNA, it may be inferred that the unknown tissue/cells are either not human or
not the same human tissue/cell type/organelle as that which expresses the cDNA. These methods of
determining tissue/cell/organelle identity are based on methods which detect the presence or
absence of the mRNA (or corresponding cDNA) in a tissue/cell sample using methods well known in
the art (e.g., hybridization, PCR based methods, immunoassays, immunochemistry, ELISA).
Examples of such techniques are described in more detail below. Therefore, the invention
encompasses uses of the polynucleotides and polypeptides of the invention as tissue markers. In a
preferred embodiment, polynucleotides preferentially expressed in given tissues as indicated in
Table X and polypeptides encoded by such polynucleotides are used for this purpose. The
invention also encompasses uses of polypeptides of the invention as organelle markers. In a
preferred embodiment, polypeptides preferentially expressed in a given subcellular compartment as
indicated in Table XI are used for this purpose.



Consequently, the present invention encompasses methods of identification of a tissue/cell
type/subcellular compartment, wherein said method includes the steps of:

a) contacting a biological sample whose identity is to be assayed with a product able to bind
a GENSET product; and

b) determining whether a GENSET product is expressed in said biological sample.

Products that are able to bind specifically to a GENSET product, namely a GENSET
polypeptide or a GENSET mRNA, include GENSET binding proteins, antibodies or binding
fragments thereof (e.g. F(ab')2 fragments), as well as GENSET complementary probes and primers.

Step b) may be performed using any detection method known to those skilled in the art
including those disclosed herein, especially in the section entitled "Detection of GENSET
expression and/or biological activity".
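
The inference described in this section, in which detection of a marker narrows the identity of an unknown sample to tissues known to express it and absence of a marker excludes those tissues, may be sketched as follows. The sketch treats each marker as strictly tissue-specific, which is a simplifying assumption, and the marker names and tissue sets are hypothetical rather than taken from Table X.

```python
def candidate_tissues(marker_tissues, detected):
    """Narrow down the possible identity of an unknown sample.

    marker_tissues -- marker name -> set of tissues expressing that marker
    detected       -- marker name -> True (detected) or False (not detected);
                      markers that were not assayed are simply omitted
    """
    candidates = set().union(*marker_tissues.values())
    for marker, tissues in marker_tissues.items():
        result = detected.get(marker)
        if result is True:
            candidates &= tissues    # sample must be one of these tissues
        elif result is False:
            candidates -= tissues    # sample cannot be any of these tissues
    return candidates


# Hypothetical marker panel and assay results.
panel = {
    "marker_1": {"brain", "liver"},
    "marker_2": {"liver", "kidney"},
    "marker_3": {"brain"},
}
results = {"marker_1": True, "marker_2": True, "marker_3": False}
print(candidate_tissues(panel, results))
```

An empty result would suggest that the sample matches none of the tissues covered by the panel, for example because it is of non-human origin or of a tissue type not represented.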

Identification of Tissue Types or Cell Species by Means of Labeled Tissue Specific Antibodies

Identification of specific tissues is accomplished by the visualization of tissue specific 
antigens by means of antibody preparations which are conjugated, directly (e.g., green fluorescent 
protein) or indirectly to a detectable marker. Selected labeled antibody species bind to their specific 
antigen binding partner in tissue sections, cell suspensions, or in extracts of soluble proteins from a
tissue sample to provide a pattern for qualitative or semi-qualitative interpretation.

Antisera for these procedures must have a potency exceeding that of the native preparation,
and for that reason, antibodies are concentrated to a mg/ml level by isolation of the gamma globulin
fraction, for example, by ion-exchange chromatography or by ammonium sulfate fractionation.
Also, to provide the most specific antisera, unwanted antibodies, for example to common proteins,
must be removed from the gamma globulin fraction, for example by means of insoluble
immunoabsorbents, before the antibodies are labeled with the marker. Either monoclonal antibodies
or heterologous antisera are suitable for either procedure.

A. Immunohistochemical Techniques 

Purified, high-titer antibodies, prepared as described above, are conjugated to a detectable
marker, as described, for example, by Fudenberg, (1980) or Rose et al., (1980), which disclosures
are hereby incorporated by reference in their entireties.

A fluorescent marker, either fluorescein or rhodamine, is preferred, but antibodies can also
be labeled with an enzyme that supports a color producing reaction with a substrate, such as
horseradish peroxidase. Markers can be added to tissue-bound antibody in a second step, as
described below. Alternatively, the specific anti-tissue antibodies can be labeled with ferritin or
other electron dense particles, and localization of the ferritin coupled antigen-antibody complexes
achieved by means of an electron microscope. In yet another approach, the antibodies are
radiolabeled, with, for example, ¹²⁵I, and detected by overlaying the antibody treated preparation
with photographic emulsion. Preparations to carry out the procedures can comprise monoclonal or
polyclonal antibodies to a single protein or peptide identified as specific to a tissue type, for
example, brain tissue, or antibody preparations to several antigenically distinct tissue specific
antigens can be used in panels, independently or in mixtures, as required. Tissue sections and cell
suspensions are prepared for immunohistochemical examination according to common histological
techniques. Multiple cryostat sections (about 4 µm, unfixed) of the unknown tissue and known
control are mounted and each slide covered with different dilutions of the antibody preparation.
Sections of known and unknown tissues should also be treated with preparations to provide a
positive control, a negative control, for example, pre-immune sera, and a control for non-specific
staining, for example, buffer. Treated sections are incubated in a humid chamber for 30 min at
room temperature, rinsed, then washed in buffer for 30-45 min. Excess fluid is blotted away, and
the marker developed. If the tissue specific antibody was not labeled in the first incubation, it can
be labeled at this time in a second antibody-antibody reaction, for example, by adding fluorescein-
or enzyme-conjugated antibody against the immunoglobulin class of the antiserum-producing
species, for example, fluorescein labeled antibody to mouse IgG. Such labeled sera are
commercially available. The antigen found in the tissues by the above procedure can be quantified
by measuring the intensity of color or fluorescence on the tissue section, and calibrating that signal
using appropriate standards.

B. Identification of Tissue Specific Soluble Proteins 

The visualization of tissue specific proteins and identification of unknown tissues from that
procedure is carried out using the labeled antibody reagents and detection strategy as described for
immunohistochemistry; however the sample is prepared according to an electrophoretic technique
to distribute the proteins extracted from the tissue in an orderly array on the basis of molecular
weight for detection. A tissue sample is homogenized using a Virtis apparatus; cell suspensions are
disrupted by Dounce homogenization or osmotic lysis, using detergents in either case as required to
disrupt cell membranes, as is the practice in the art. Insoluble cell components such as nuclei,
microsomes, and membrane fragments are removed by ultracentrifugation, and the soluble
protein-containing fraction concentrated if necessary and reserved for analysis. A sample of the
soluble protein solution is resolved into individual protein species by conventional SDS
polyacrylamide electrophoresis as described, for example, by Davis et al., Section 19-2 (1986),
using a range of amounts of polyacrylamide in a set of gels to resolve the entire molecular weight
range of proteins to be detected in the sample. A size marker is run in parallel for purposes of
estimating molecular weights of the constituent proteins. Sample size for analysis is a convenient
volume of from 5 to 55 µl, and containing from about 1 to 100 µg protein. An aliquot of each of the
resolved proteins is transferred by blotting to a nitrocellulose filter paper, a process that maintains
the pattern of resolution. Multiple copies are prepared. The procedure, known as Western Blot
Analysis, is well described in Davis et al., (1986) Section 19-3. One set of nitrocellulose blots is
stained with Coomassie Blue dye to visualize the entire set of proteins for comparison with the
antibody bound proteins. The remaining nitrocellulose filters are then incubated with a solution of
one or more specific antisera to tissue specific proteins prepared as described herein. In this
procedure, as in procedure A above, appropriate positive and negative sample and reagent controls
are run.
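
The use of a size marker run in parallel to estimate molecular weights may be sketched as follows. The sketch relies on the commonly used approximation that log10 of molecular weight varies linearly with migration distance within the resolving range of an SDS polyacrylamide gel; this approximation, the function names and the example values are assumptions for illustration and are not taken from the cited protocol.

```python
import math


def fit_marker_curve(markers):
    """Least-squares fit of log10(molecular weight) against migration distance.

    markers -- list of (migration_distance, molecular_weight_kDa) pairs for
               the size-marker lane; at least two distinct distances needed.
    Returns (slope, intercept) of the fitted line.
    """
    if len(markers) < 2:
        raise ValueError("at least two marker bands are required")
    xs = [distance for distance, _ in markers]
    ys = [math.log10(mw) for _, mw in markers]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("marker bands must have distinct migration distances")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def estimate_mw(distance, slope, intercept):
    """Estimate molecular weight (kDa) of a band from its migration distance."""
    return 10 ** (slope * distance + intercept)


# Hypothetical marker lane: 100 kDa at 1.0 cm and 10 kDa at 3.0 cm.
slope, intercept = fit_marker_curve([(1.0, 100.0), (3.0, 10.0)])
print(round(estimate_mw(2.0, slope, intercept), 1))
```

Estimates for bands migrating outside the range spanned by the markers are extrapolations and should be treated with corresponding caution.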

In either procedure A or B, a detectable label can be attached to the primary tissue
antigen-primary antibody complex according to various strategies and permutations thereof. In a
straightforward approach, the primary specific antibody can be labeled; alternatively, the unlabeled
complex can be bound by a labeled secondary anti-IgG antibody. In other approaches, either the
primary or secondary antibody is conjugated to a biotin molecule, which can, in a subsequent step,
bind an avidin-conjugated marker. According to yet another strategy, enzyme labeled or radioactive
protein A, which has the property of binding to any IgG, is bound in a final step to either the
primary or secondary antibody. The visualization of tissue specific antigen binding at levels above
those seen in control tissues to one or more tissue specific antibodies, prepared from the gene
sequences identified from cDNA sequences, can identify tissues of unknown origin, for example,
forensic samples, or differentiated tumor tissue that has metastasized to foreign bodily sites.

Targeting of compounds to subcellular compartments 
GENSET polypeptides expressed in specific cellular compartments/organelles may also be
used to target compounds to these compartments/organelles. The invention therefore encompasses
uses of polypeptides and polynucleotides of the invention as organelle targeting tools.

In a first embodiment, GENSET polypeptides expressed in mitochondria may be used to
target heterologous compounds, either polypeptides or polynucleotides, to mitochondria by
recombinantly or chemically fusing a fragment of the protein of the invention to a heterologous
polypeptide or polynucleotide. Preferred fragments are signal peptides, amphiphilic alpha helices
and/or any other fragments of the protein of the invention, or part thereof, that may contain
targeting signals for mitochondria including but not limited to matrix targeting signals as defined in
Herrman and Neupert (2000); Bhagwat et al. (1999); Murphy (1997); Glaser et al. (1998);
Ciminale et al. (1999), which disclosures are hereby incorporated by reference in their entireties.
Such heterologous compounds may be used to modulate mitochondrial activities. For example,
they may be used to induce and/or prevent mitochondrial-induced apoptosis or necrosis. In
addition, heterologous polynucleotides may be used for mitochondrial gene therapy to replace a
defective mitochondrial gene and/or to inhibit the deleterious expression of a mitochondrial gene.

In a second embodiment, GENSET polypeptides expressed in the endoplasmic reticulum may
be used to target heterologous polypeptides to the endoplasmic reticulum by recombinantly or
chemically fusing a fragment of the proteins of the invention to a heterologous polypeptide. Preferred
fragments are any fragments of the proteins of the invention, or part thereof, that may contain targeting
signals for the endoplasmic reticulum such as those described in Pidoux and Armstrong (1992); Munro
and Pelham (1987); Pelham (1990), which disclosures are hereby incorporated by reference in their
entireties.

In a third embodiment, GENSET polypeptides expressed in the nucleus may be used to target
heterologous polypeptides or polynucleotides to the nucleus by recombinantly or chemically fusing a
fragment of the proteins or polynucleotides of the invention to a heterologous polypeptide or
polynucleotide. Preferred fragments are any fragments of the proteins or polynucleotides of the
invention, or part thereof, that may contain targeting signals for the nucleus (nuclear localization
signals) such as those described in Christophe et al. (2000), which disclosure is hereby incorporated by
reference in its entirety.

In a fourth embodiment, GENSET polypeptides expressed in the Golgi apparatus may be used to
target heterologous polypeptides to the Golgi apparatus by recombinantly or chemically fusing a
fragment of the protein of the invention to a heterologous polypeptide. Preferred fragments are
signal peptides, transmembrane domains, tyrosine-containing regions and/or any other fragments of
the proteins of the invention, or part thereof, that may contain (1) targeting signals for the Golgi
apparatus such as the ones described in Ugur and Jones (2000); Picetti and Borrelli (2000), (2) a
tyrosine-based Golgi targeting signal region (Zhan et al. (1998); Watson and Pessin (2000); Ward
and Moss (2000)), or (3) any other region as defined in Munro (1998); Luetterforst et al. (1999);
Essl et al. (1999), which disclosures are hereby incorporated by reference in their entireties.

Screening and diagnosis of abnormal GENSET expression and/or biological activity 

Moreover, antibodies and/or primers specific for GENSET expression may also be used to
identify abnormal GENSET expression and/or biological activity, and subsequently to screen and/or
diagnose disorders associated with abnormal GENSET expression. For example, a particular
disease may result from lack of expression, over expression, or under expression of a GENSET
mRNA. By comparing mRNA expression patterns and quantities in samples taken from healthy
individuals with those from individuals suffering from a particular disorder, genes responsible for
this disorder may be identified. Primers, probes and antibodies specific for this GENSET may then
be used to develop screening and diagnostic kits for a disorder in which the gene of interest is
specifically expressed or in which its expression is specifically dysregulated, i.e. underexpressed or
overexpressed.



Screening for specific disorders 

The present invention also relates to methods of identifying individuals having elevated or
reduced levels of GENSET, which individuals are likely to benefit from therapies to suppress or
enhance GENSET expression, respectively. One example of such methods comprises the steps of:

a) obtaining from a human or non-human mammal a biological sample;

b) detecting the presence in said sample of a GENSET product (mRNA or protein) using
any method known to those skilled in the art including those described herein, especially in the
section entitled "Detection of GENSET products";

c) comparing the amount of said GENSET product present in said sample with that of a
control sample; and

d) determining whether said human or non-human mammal has a reduced or elevated level of
GENSET expression compared to the control sample.

Such individuals with reduced or elevated levels of GENSET products may be predisposed
to disorders associated with dysregulation of GENSET gene expression and thus would be
candidates for therapies. The identification of elevated levels of GENSET in a patient would be
indicative of an individual that would benefit from treatment with agents that suppress GENSET
expression or activity. The identification of low levels of GENSET in a patient would be indicative
of an individual that would benefit from agents that induce GENSET expression or activity.
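
The comparison in steps c) and d) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the relative tolerance of 20% around the control level, and the wording of the returned labels are hypothetical and are not specified in the present description, which leaves the detection method and the decision threshold to the skilled person.

```python
def classify_expression(sample_level, control_level, tolerance=0.2):
    """Compare a sample's GENSET product level with that of a control sample.

    `tolerance` is a hypothetical relative band around the control level
    inside which the sample is treated as unchanged.
    """
    if control_level <= 0:
        raise ValueError("control level must be positive")
    ratio = sample_level / control_level
    if ratio > 1 + tolerance:
        return "elevated"
    if ratio < 1 - tolerance:
        return "reduced"
    return "normal"


# Candidate therapy class for each outcome, following the text above.
THERAPY = {
    "elevated": "agents that suppress GENSET expression or activity",
    "reduced": "agents that induce GENSET expression or activity",
    "normal": None,
}

result = classify_expression(sample_level=2.0, control_level=1.0)
print(result, "->", THERAPY[result])
```

In practice the amounts compared would come from one of the detection protocols named below, and the threshold would have to be established for each assay.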

Biological samples suitable for use in this method include biological fluids such as blood,
lymph, saliva, sperm, maternal milk, and tissue samples (e.g. biopsies) as well as cell cultures or
cell extracts derived, for example, from tissue biopsies. The detection step of the present method
can be performed using standard protocols for protein/mRNA detection. Examples of suitable
protocols include Northern blot analysis, immunoassays (e.g. RIA, Western blots,
immunohistochemical analyses), and PCR.

Thus, the present invention further relates to methods of identifying individuals or non-human
animals at increased risk of developing, or currently having, certain
diseases/disorders associated with abnormal GENSET expression or biological activity. One
example of such methods comprises the steps of:

a) obtaining from a human or non-human mammal a biological sample;

b) detecting the presence or absence in said sample of a GENSET product (mRNA or
protein);

c) comparing the amount of said GENSET product present in said sample with that of a
control sample; and






WO 01/42451 PCT/IB00/01938 

d) determining whether said human or non-human mammal is at increased risk of
developing, or currently has, a disease or disorder.

In accordance with this method, the presence in the sample of altered levels of GENSET
product indicates that the subject is predisposed to the above-indicated diseases/disorders.
Biological samples suitable for use in this method include biological fluids such as blood, lymph,
saliva, sperm, maternal milk, and tissue samples (e.g. biopsies).

The diagnostic methodologies described herein are applicable to both humans and 
non-human mammals. 

Detection of GENSET mutations 

The invention also encompasses methods to detect mutations in GENSET polynucleotides
of the invention. Such methods may advantageously be used to detect mutations occurring in
GENSET genes and preferably in their regulatory regions. When a mutation has been shown to be
associated with a disease, screening for such mutations may be used for screening and diagnosis
purposes.

In one embodiment of the oligonucleotide arrays of the invention, an oligonucleotide probe
matrix may advantageously be used to detect mutations occurring in GENSET genes and preferably
in their regulatory regions. For this particular purpose, probes are specifically designed to have a
nucleotide sequence allowing their hybridization to the genes that carry known mutations (either by
deletion, insertion or substitution of one or several nucleotides). By known mutations is meant
mutations on the GENSET genes that have been identified according, for example, to the technique
used by Huang et al. (1996) or Samson et al. (1996), which disclosures are hereby incorporated by
reference in their entireties.

Another technique that is used to detect mutations in GENSET genes is the use of a high-density
DNA array. Each oligonucleotide probe constituting a unit element of the high density
DNA array is designed to match a specific subsequence of a GENSET genomic DNA or cDNA.
Thus, an array consisting of oligonucleotides complementary to subsequences of the target gene
sequence is used to determine the identity of the target sequence with the wild-type gene sequence,
measure its amount, and detect differences between the target sequence and the reference wild-type
sequence of the GENSET gene. In one such design, termed a 4L tiled array, a set of four probes
(A, C, G, T), preferably 15-nucleotide oligomers, is implemented for each position of the target.
In each set of four probes, the perfect complement will hybridize more strongly than mismatched
probes. Consequently, a nucleic acid target of length L is scanned for mutations with a tiled array
containing 4L probes, the whole probe set containing all the possible single-base substitutions in
the known wild-type reference sequence. The hybridization signals of the 15-mer probe set tiled
array are perturbed by a single base change in the target sequence. As a consequence, there is a
characteristic loss of signal or a "footprint" for the probes flanking a mutation position. This
technique was described by Chee et al. in 1996, which disclosure is hereby incorporated by
reference in its entirety.
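
The tiling principle can be illustrated with a minimal Python sketch. It rests on simplifying assumptions that are not part of the cited technique: hybridization is idealized as an all-or-nothing perfect match, only positions with a full 15-nucleotide window are interrogated (so the sketch holds 4(L-14) probes rather than 4L), and the 30-nucleotide reference sequence and all function names are made up for the example.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
BASES = "ACGT"


def reverse_complement(seq):
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def tiled_array(reference, probe_len=15):
    """Build four probes (central base A, C, G or T) for each interior position."""
    half = probe_len // 2
    array = {}
    for pos in range(half, len(reference) - half):
        window = reference[pos - half: pos + half + 1]
        array[pos] = {
            base: reverse_complement(window[:half] + base + window[half + 1:])
            for base in BASES
        }
    return array


def call_base(target, array, pos, probe_len=15):
    """Return the bases whose probe is a perfect complement of the target window."""
    half = probe_len // 2
    window = target[pos - half: pos + half + 1]
    return [b for b, probe in array[pos].items()
            if reverse_complement(probe) == window]


# Hypothetical reference and a target carrying a single A->G change at index 15.
reference = "ACGTTGCAAGGCTTAACCGGATCGATCGTA"
target = reference[:15] + "G" + reference[16:]

array = tiled_array(reference)
calls = {pos: call_base(target, array, pos) for pos in array}
footprint = [pos for pos, bases in calls.items() if not bases]

print("call at mutated position:", calls[15])
print("positions with no perfect match (footprint):", footprint)
```

At the mutated position the probe carrying the substituted base gives the match, while the sets centered on the flanking positions lose their perfect match, which corresponds to the footprint described above.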



Construction of DNA constructs with a GENSET expression pattern 

In addition, characterization of the spatial and temporal expression patterns and expression
levels of GENSET mRNAs is also useful for constructing expression vectors capable of producing a
desired level of gene product in a desired spatial or temporal manner, as discussed below.

DNA Construct That Enables Directing Temporal And Spatial GENSET Gene Expression In 
Recombinant Cell Hosts And In Transgenic Animals. 

In order to study the physiological and phenotypic consequences of a lack of synthesis of a
GENSET protein, both at the cell level and at the multicellular organism level, the invention also
encompasses DNA constructs and recombinant vectors enabling a conditional expression of a
specific allele of a GENSET genomic sequence or cDNA and also of a copy of this genomic
sequence or cDNA harboring substitutions, deletions, or additions of one or more bases with respect
to a nucleotide sequence selected from the group consisting of sequences of SEQ ID Nos 1-241 and
sequences of clone inserts of the deposited clone pool, or a fragment thereof, these base
substitutions, deletions or additions being located either in an exon, an intron or a regulatory
sequence, but preferably in the 5'-regulatory sequence or in an exon of the GENSET genomic
sequence or within the GENSET cDNA.

A first preferred DNA construct is based on the tetracycline resistance operon tet from the E.
coli transposon Tn10 for controlling the GENSET gene expression, such as described by Gossen et
al. (1992, 1995) and Furth et al. (1994), which disclosures are hereby incorporated by reference in
their entireties. Such a DNA construct contains seven tet operator sequences from Tn10 (tetop) that
are fused to either a minimal promoter or a 5'-regulatory sequence of the GENSET gene, said
minimal promoter or said GENSET regulatory sequence being operably linked to a polynucleotide
of interest that codes either for a sense or an antisense oligonucleotide or for a polypeptide,
including a GENSET polypeptide or a peptide fragment thereof. This DNA construct is functional
as a conditional expression system for the nucleotide sequence of interest when the same cell also
comprises a nucleotide sequence coding for either the wild type (tTA) or the mutant (rTA) repressor
fused to the activating domain of viral protein VP16 of herpes simplex virus, placed under the
control of a promoter, such as the HCMV IE1 enhancer/promoter or the MMTV-LTR. Indeed, a
preferred DNA construct of the invention comprises both the polynucleotide containing the tet
operator sequences and the polynucleotide containing a sequence coding for the tTA or the rTA
repressor. In a specific embodiment, the conditional expression DNA construct contains the
sequence encoding the mutant tetracycline repressor rTA; in this case, the expression of the
polynucleotide of interest is silent in the absence of tetracycline and induced in its presence.
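
The conditional behavior of the two transactivators can be summarized as a small truth table, sketched here in Python. The rTA row follows the specific embodiment stated above. The tTA row is not spelled out in this passage and is filled in from the generally known behavior of the wild type repressor fusion, which drives expression in the absence of tetracycline; it should be read as an assumption of this sketch. The function name is hypothetical.

```python
def expression_on(transactivator, tetracycline_present):
    """Idealized on/off state of the tet-operator-driven polynucleotide of interest."""
    if transactivator == "rTA":
        # Mutant repressor fusion: silent without tetracycline, induced with it.
        return tetracycline_present
    if transactivator == "tTA":
        # Wild type repressor fusion (assumption, see lead-in): the reverse pattern.
        return not tetracycline_present
    raise ValueError("unknown transactivator: %r" % transactivator)


for name in ("tTA", "rTA"):
    for tet in (False, True):
        state = "expressed" if expression_on(name, tet) else "silent"
        print(name, "tetracycline" if tet else "no tetracycline", "->", state)
```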
DNA Constructs Allowing Homologous Recombination: Replacement Vectors

A second preferred DNA construct will comprise, from 5'-end to 3'-end: (a) a first
nucleotide sequence that is comprised in the GENSET genomic sequence; (b) a nucleotide sequence
comprising a positive selection marker, such as the marker for neomycin resistance (neo); and (c) a
second nucleotide sequence that is comprised in the GENSET genomic sequence, and is located on
the genome downstream of the first GENSET nucleotide sequence (a).

In a preferred embodiment, this DNA construct also comprises a negative selection marker
located upstream of the nucleotide sequence (a) or downstream of the nucleotide sequence (c).
Preferably, the negative selection marker comprises the thymidine kinase (tk) gene (Thomas et al.,
1986), the hygromycin beta gene (Te Riele et al., 1990), the hprt gene (Van der Lugt et al., 1991;
Reid et al., 1990) or the Diphtheria toxin A fragment (Dt-A) gene (Nada et al., 1993; Yagi et
al., 1990), which disclosures are hereby incorporated by reference in their entireties. Preferably, the
positive selection marker is located within a GENSET exon sequence so as to interrupt the sequence
encoding a GENSET protein. These replacement vectors are described, for example, by Thomas et
al. (1986; 1987), Mansour et al. (1988) and Koller et al. (1992).

The first and second nucleotide sequences (a) and (c) may be indifferently located within a 
GENSET regulatory sequence, an intronic sequence, an exon sequence or a sequence containing 
both regulatory and/or intronic and/or exon sequences. The size of the nucleotide sequences (a) and 
(c) ranges from 1 to 50 kb, preferably from 1 to 10 kb, more preferably from 2 to 6 kb and most
preferably from 2 to 4 kb.

DNA Constructs Allowing Homologous Recombination: Cre-LoxP System. 

These new DNA constructs make use of the site specific recombination system of the P1
phage. The P1 phage possesses a recombinase called Cre which interacts specifically with a 34
base pair loxP site. The loxP site is composed of two palindromic sequences of 13 bp separated by
an 8 bp conserved sequence (Hoess et al., 1986), which disclosure is hereby incorporated by
reference in its entirety. The recombination by the Cre enzyme between two loxP sites having an
identical orientation leads to the deletion of the DNA fragment.
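
The 13 bp + 8 bp + 13 bp layout of the site and the deletion between two sites of identical orientation can be sketched as follows. The sketch makes several assumptions: the 34 bp sequence used is the commonly cited loxP sequence and is not given in the present description, recombination is modeled as a plain string operation on one strand that leaves a single site behind, and the flanking and marker sequences and all function names are placeholders.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Commonly cited loxP sequence (assumption; not stated in this document).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def reverse_complement(seq):
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def describe_loxp(site=LOXP):
    """Split a loxP site into its 13 bp arm, 8 bp spacer and 13 bp arm."""
    left, spacer, right = site[:13], site[13:21], site[21:]
    arms_are_inverted_repeats = right == reverse_complement(left)
    return left, spacer, right, arms_are_inverted_repeats


def cre_excise(dna, site=LOXP):
    """Delete the segment between the first two same-orientation sites.

    One copy of the site is kept; DNA with fewer than two sites is unchanged.
    """
    first = dna.find(site)
    if first == -1:
        return dna
    second = dna.find(site, first + len(site))
    if second == -1:
        return dna
    return dna[:first] + dna[second:]


floxed = "GGGG" + LOXP + "AAACCCGGG" + LOXP + "CCCC"  # placeholder marker between sites
print(describe_loxp())
print(cre_excise(floxed))
```

Because the asymmetric 8 bp spacer gives the site its orientation, searching for the same literal string twice on one strand is what stands in for "identical orientation" here.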

The Cre-loxP system used in combination with a homologous recombination technique was
first described by Gu et al. (1993, 1994), which disclosures are hereby incorporated by
reference in their entireties. Briefly, a nucleotide sequence of interest to be inserted in a targeted
location of the genome harbors at least two loxP sites in the same orientation and located at the
respective ends of a nucleotide sequence to be excised from the recombinant genome. The excision
event requires the presence of the recombinase (Cre) enzyme within the nucleus of the recombinant
cell host. The recombinase enzyme may be brought at the desired time by (a) incubating the
recombinant cell hosts in a culture medium containing this enzyme, injecting the Cre enzyme
directly into the desired cell, such as described by Araki et al. (1995), which disclosure is hereby
incorporated by reference in its entirety, or lipofecting the enzyme into the cells, such as
described by Baubonis et al. (1993), which disclosure is hereby incorporated by reference in its
entirety; (b) transfecting the cell host with a vector comprising the Cre coding sequence operably
linked to a promoter functional in the recombinant cell host, which promoter is optionally
inducible, said vector being introduced in the recombinant cell host, such as described by Gu et
al. (1993) and Sauer et al. (1988), which disclosures are hereby incorporated by reference in their
entireties; or (c) introducing in the genome of the cell host a polynucleotide comprising the Cre
coding sequence operably linked to a promoter functional in the recombinant cell host, which
promoter is optionally inducible, said polynucleotide being inserted in the genome of the cell host
either by a random insertion event or a homologous recombination event, such as described by Gu et
al. (1994).

In a specific embodiment, when the vector containing the sequence to be inserted in the GENSET
gene by homologous recombination is constructed in such a way that selectable markers are flanked
by loxP sites of the same orientation, it is possible, by treatment with the Cre enzyme, to eliminate the
selectable markers while leaving the GENSET sequences of interest that have been inserted by a
homologous recombination event. Again, two selectable markers are needed: a positive selection
marker to select for the recombination event and a negative selection marker to select for the
homologous recombination event. Vectors and methods using the Cre-loxP system are described by
Zou et al. (1994), which disclosure is hereby incorporated by reference in its entirety.

Thus, a third preferred DNA construct of the invention comprises, from 5'-end to 3'-end:
(a) a first nucleotide sequence that is comprised in the GENSET genomic sequence; (b) a nucleotide
sequence comprising a polynucleotide encoding a positive selection marker, said nucleotide
sequence comprising additionally two sequences defining a site recognized by a recombinase, such
as a loxP site, the two sites being placed in the same orientation; and (c) a second nucleotide
sequence that is comprised in the GENSET genomic sequence, and is located on the genome
downstream of the first GENSET nucleotide sequence (a).

The sequences defining a site recognized by a recombinase, such as a loxP site, are
preferably located within the nucleotide sequence (b) at suitable locations bordering the nucleotide
sequence for which the conditional excision is sought. In one specific embodiment, two loxP sites
are located one on each side of the positive selection marker sequence, in order to allow its excision
at a desired time after the occurrence of the homologous recombination event.

In a preferred embodiment of a method using the third DNA construct described above, the
excision of the polynucleotide fragment bordered by the two sites recognized by a recombinase,
preferably two loxP sites, is performed at a desired time, due to the presence within the genome of
the recombinant host cell of a sequence encoding the Cre enzyme operably linked to a promoter
sequence, preferably an inducible promoter, more preferably a tissue-specific promoter sequence
and most preferably a promoter sequence which is both inducible and tissue-specific, such as
described by Gu et al. (1994).

The presence of the Cre enzyme within the genome of the recombinant cell host may result
from the breeding of two transgenic animals, the first transgenic animal bearing the GENSET-derived
sequence of interest containing the loxP sites as described above and the second transgenic
animal bearing the Cre coding sequence operably linked to a suitable promoter sequence, such as
described by Gu et al. (1994).

Spatio-temporal control of the Cre enzyme expression may also be achieved with an
adenovirus based vector that contains the Cre gene thus allowing infection of cells, or in vivo
infection of organs, for delivery of the Cre enzyme, such as described by Anton and Graham (1995)
and Kanegae et al. (1995), which disclosures are hereby incorporated by reference in their entireties.

The DNA constructs described above may be used to introduce a desired nucleotide 
sequence of the invention, preferably a GENSET genomic sequence or a GENSET cDNA sequence, 
and most preferably an altered copy of a GENSET genomic or cDNA sequence, within a 
predetermined location of the targeted genome, leading either to the generation of an altered copy of
a targeted gene (knock-out homologous recombination) or to the replacement of a copy of the 
targeted gene by another copy sufficiently homologous to allow an homologous recombination 
event to occur (knock-in homologous recombination). 

Modifying GENSET expression and/or biological activity 

Modifying endogenous GENSET expression and/or biological activity is expressly
contemplated by the present invention.

Screening for compounds that modulate GENSET expression and/or biological activity 

The present invention further relates to compounds able to modulate GENSET expression 
and/or biological activity and methods to use these compounds. Such compounds may interact with 
the regulatory sequences of GENSET genes or they may interact with GENSET polypeptides
directly or indirectly. 

Compounds Interacting With GENSET Regulatory Sequences 

The present invention also concerns a method for screening substances or molecules that are
able to interact with the regulatory sequences of a GENSET gene, such as for example promoter or
enhancer sequences in untranscribed regions of the genomic DNA, as determined using any
techniques known to those skilled in the art including those described in the section entitled
"Identification of Promoters in Cloned Upstream Sequences", or such as regulatory sequences
located in untranslated regions of GENSET mRNA.

Sequences within untranscribed or untranslated regions of polynucleotides of the invention
may be identified by comparison to databases containing known regulatory sequences such as
transcription start sites, transcription factor binding sites, promoter sequences, enhancer sequences,
5'UTR and 3'UTR elements (Pesole et al., 2000;
http://igs-server.cnrs-mrs.fr/~gauthere/UTR/index.html). Alternatively, the regulatory sequences of
interest may be identified through conventional mutagenesis or deletion analyses of reporter
plasmids using, for instance, techniques described in the section entitled "Identification of
Promoters in Cloned Upstream Sequences".

Following the identification of potential GENSET regulatory sequences, proteins which 
interact with these regulatory sequences may be identified as described below. 

Gel retardation assays may be performed independently in order to screen candidate
molecules that are able to interact with the regulatory sequences of the GENSET gene, such as
described by Fried and Crothers (1981), Garner and Revzin (1981) and Dent and Latchman (1993),
the teachings of these publications being herein incorporated by reference. These techniques are
based on the principle according to which a DNA or mRNA fragment which is bound to a protein
migrates more slowly than the same unbound DNA or mRNA fragment. Briefly, the target nucleotide
sequence is labeled. Then the labeled target nucleotide sequence is brought into contact with either
a total nuclear extract from cells containing regulation factors, or with different candidate molecules
to be tested. The interaction between the target regulatory sequence of the GENSET gene and the
candidate molecule or the regulation factor is detected after gel or capillary electrophoresis through
a retardation in the migration.

Nucleic acids encoding proteins which are able to interact with the promoter sequence of
the GENSET gene, more particularly a nucleotide sequence selected from the group consisting of
the polynucleotides of the 5' and 3' regulatory region or a fragment or variant thereof, may be
identified by using a one-hybrid system, such as that described in the booklet enclosed in the
Matchmaker One-Hybrid System kit from Clontech (Catalog Ref. n° K1603-1), the technical
teachings of which are herein incorporated by reference. Briefly, the target nucleotide sequence is
cloned upstream of a selectable reporter sequence and the resulting polynucleotide construct is
integrated in the yeast genome (Saccharomyces cerevisiae). Preferably, multiple copies of the
target sequences are inserted into the reporter plasmid in tandem. The yeast cells containing the
reporter sequence in their genome are then transformed with a library comprising fusion molecules
between cDNAs encoding candidate proteins for binding onto the regulatory sequences of the
GENSET gene and sequences encoding the activator domain of a yeast transcription factor such as
GAL4. The recombinant yeast cells are plated in a culture broth for selecting cells expressing the
reporter sequence. The recombinant yeast cells thus selected contain a fusion protein that is able to
bind onto the target regulatory sequence of the GENSET gene. Then, the cDNAs encoding the
fusion proteins are sequenced and may be cloned into expression or transcription vectors in vitro.
The binding of the encoded polypeptides to the target regulatory sequences of the GENSET gene
may be confirmed by techniques familiar to the one skilled in the art, such as gel retardation assays
or DNAse protection assays.



Ligands interacting with GENSET polypeptides 

For the purpose of the present invention, a ligand means a molecule, such as a protein, a
peptide, an antibody or any synthetic chemical compound capable of binding to a GENSET protein
or one of its fragments or variants or of modulating the expression of the polynucleotide coding for
GENSET or a fragment or variant thereof.

In the ligand screening method according to the present invention, a biological sample or a
defined molecule to be tested as a putative ligand of a GENSET protein is brought into contact with
the corresponding purified GENSET protein, for example the corresponding purified recombinant
GENSET protein produced by a recombinant cell host as described herein, in order to form a
complex between this protein and the putative ligand molecule to be tested.

As an illustrative example, to study the interaction of a GENSET protein, or a fragment
comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more
preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of a polypeptide selected from the
group consisting of sequences of SEQ ID Nos: 242-482, mature polypeptides included in SEQ ID
Nos: 242-272 and 274-384 as well as full-length and mature polypeptides encoded by the clone
inserts of the deposited clone pool, with drugs or small molecules, such as molecules generated
through combinatorial chemistry approaches, the microdialysis coupled to HPLC method described
by Wang et al. (1997) or the affinity capillary electrophoresis method described by Bush et al.
(1997), the disclosures of which are incorporated by reference, can be used.

In further methods, peptides, drugs, fatty acids, lipoproteins, or small molecules which
interact with a GENSET protein, or a fragment comprising a contiguous span of at least 6 amino
acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or
100 amino acids of a polypeptide selected from the group consisting of sequences of SEQ ID Nos:
242-482, mature polypeptides included in SEQ ID Nos: 242-272 and 274-384, as well as full-length
and mature polypeptides encoded by the clone inserts of the deposited clone pool may be identified
using assays such as the following. The molecule to be tested for binding is labeled with a
detectable label, such as a fluorescent, radioactive, or enzymatic tag and placed in contact with
immobilized GENSET protein, or a fragment thereof, under conditions which permit specific
binding to occur. After removal of non-specifically bound molecules, bound molecules are detected
using appropriate means.

Various candidate substances or molecules can be assayed for interaction with a GENSET
polypeptide. These substances or molecules include, without being limited to, natural or synthetic
organic compounds or molecules of biological origin such as polypeptides. When the candidate
substance or molecule comprises a polypeptide, this polypeptide may be the resulting expression
product of a phage clone belonging to a phage-based random peptide library, or alternatively the
polypeptide may be the resulting expression product of a cDNA library cloned in a vector suitable
for performing a two-hybrid screening assay.

A. Candidate ligands obtained from random peptide libraries 
In a particular embodiment of the screening method, the putative ligand is the expression
product of a DNA insert contained in a phage vector (Parmley and Smith, 1988). Specifically,
random peptide phage libraries are used. The random DNA inserts encode peptides of 8 to 20
amino acids in length (Oldenburg et al., 1992; Valadon et al., 1996; Lucas, 1994; Westerink, 1995;
Felici et al., 1991), which disclosures are hereby incorporated by reference in their entireties.
According to this particular embodiment, the recombinant phages expressing a protein that binds to
an immobilized GENSET protein are retained and the complex formed between the GENSET protein
and the recombinant phage may be subsequently immunoprecipitated by a polyclonal or a
monoclonal antibody directed against the GENSET protein.

Once the ligand library in recombinant phages has been constructed, the phage population is 
brought into contact with the immobilized GENSET protein. The preparation of complexes is then 
washed in order to remove the non-specifically bound recombinant phages. The phages that bind 
specifically to the GENSET protein are then eluted with a buffer (acid pH) or immunoprecipitated by 
the monoclonal antibody produced by the anti-GENSET hybridoma, and this phage population is 
subsequently amplified by an over-infection of bacteria (for example E. coli). The selection step 
may be repeated several times, preferably 2-4 times, in order to select the more specific 
recombinant phage clones. The last step comprises characterizing the peptide produced by the 
selected recombinant phage clones either by expression in infected bacteria and isolation, by 
expressing the phage insert in another host-vector system, or by sequencing the insert contained in 
the selected recombinant phages. 
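
By way of illustration only, the effect of repeating the selection step can be sketched numerically. The sketch below assumes, purely hypothetically, that specific binders and background phages are each recovered with a fixed fraction at every round and that amplification multiplies both populations equally; the function name and all numerical values are assumptions and are not taken from the present specification.

```python
def enrichment_by_round(initial_specific_fraction, recovery_specific,
                        recovery_background, rounds):
    """Return the fraction of specific binders after each selection round.

    All parameters are hypothetical illustration values:
    - initial_specific_fraction: fraction of specific phages in the library
    - recovery_specific: fraction of specific phages retained per round
    - recovery_background: fraction of non-specific phages retained per round
    """
    fraction = initial_specific_fraction
    history = []
    for _ in range(rounds):
        kept_specific = fraction * recovery_specific
        kept_background = (1.0 - fraction) * recovery_background
        # Amplification is assumed uniform, so only the ratio matters.
        fraction = kept_specific / (kept_specific + kept_background)
        history.append(fraction)
    return history


if __name__ == "__main__":
    for i, f in enumerate(enrichment_by_round(1e-6, 0.1, 1e-4, 4), start=1):
        print(f"round {i}: specific fraction = {f:.4g}")
```

Under these assumed values the specific fraction rises steeply over the first few rounds, which is consistent with the preference stated above for repeating the selection step 2-4 times.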

B. Candidate ligands obtained by competition experiments. 

Alternatively, peptides, drugs or small molecules which bind to a GENSET protein or 
fragment thereof comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 
amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of a polypeptide 
selected from the group consisting of sequences of SEQ ID Nos: 242-482, mature polypeptides 
included in SEQ ID Nos: 242-272 and 274-384, as well as full-length and mature polypeptides 
encoded by the clone inserts of the deposited clone pool, may be identified in competition 
experiments. In such assays, the GENSET protein, or a fragment thereof, is immobilized to a 
surface, such as a plastic plate. Increasing amounts of the peptides, drugs or small molecules are 
placed in contact with the immobilized GENSET protein, or a fragment thereof, in the presence of a 
detectably labeled known GENSET protein ligand. For example, the GENSET ligand may be 
detectably labeled with a fluorescent, radioactive, or enzymatic tag. The ability of the test molecule 
to bind the GENSET protein, or a fragment thereof, is determined by measuring the amount of 
detectably labeled known ligand bound in the presence of the test molecule. A decrease in the 
amount of known ligand bound to the GENSET protein, or a fragment thereof, when the test 
molecule is present indicates that the test molecule is able to bind to the GENSET protein, or a 
fragment thereof. 
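
As a non-limiting sketch, the readout of such a competition experiment may be expressed as the percentage of labeled ligand displaced at each amount of test molecule. The function names, the optional background correction and the example signal values below are assumptions introduced for illustration only.

```python
def percent_displacement(bound_with_test, bound_without_test, background=0.0):
    """Percentage of labeled known ligand displaced by the test molecule.

    bound_with_test / bound_without_test are label signals (arbitrary units);
    background is an assumed non-specific signal subtracted from both.
    """
    window = bound_without_test - background
    if window <= 0:
        raise ValueError("signal without test molecule must exceed background")
    return 100.0 * (1.0 - (bound_with_test - background) / window)


def displacement_curve(readings, bound_without_test, background=0.0):
    """readings: iterable of (amount_of_test_molecule, bound_signal) pairs.

    Returns (amount, percent displaced) pairs sorted by increasing amount.
    """
    return [
        (amount, percent_displacement(signal, bound_without_test, background))
        for amount, signal in sorted(readings)
    ]


if __name__ == "__main__":
    # Hypothetical signals measured at increasing amounts of test molecule.
    for amount, pct in displacement_curve([(1, 750.0), (10, 250.0)], 1000.0):
        print(amount, pct)
```

A displacement that grows as the amount of test molecule increases corresponds to the decrease in bound known ligand described above.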

C. Candidate ligands obtained by affinity chromatography. 

Proteins or other molecules interacting with a GENSET protein, or a fragment thereof 
comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more 
preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of a polypeptide selected from the 
group consisting of sequences of SEQ ID Nos: 242-482, mature polypeptides included in SEQ ID 
Nos: 242-272 and 274-384, as well as full-length and mature polypeptides encoded by the clone 
inserts of the deposited clone pool, can also be found using affinity columns which contain the 
GENSET protein, or a fragment thereof. The GENSET protein, or a fragment thereof, may be 
attached to the column using conventional techniques including chemical coupling to a suitable 
column matrix such as agarose, Affi Gel®, or other matrices familiar to those of skill in the art. In 
some embodiments of this method, the affinity column contains chimeric proteins in which the 
GENSET protein, or a fragment thereof, is fused to glutathione S-transferase (GST). A mixture of 
cellular proteins or pool of expressed proteins as described above is applied to the affinity column. 
Proteins or other molecules interacting with the GENSET protein, or a fragment thereof, attached to 
the column can then be isolated and analyzed on 2-D electrophoresis gel as described in Ramunsen 
et al. (1997), the disclosure of which is incorporated by reference. Alternatively, the proteins 
retained on the affinity column can be purified by electrophoresis based methods and sequenced. 
The same method can be used to isolate antibodies, to screen phage display products, or to screen 
phage display human antibodies. 

D. Candidate ligands obtained by optical biosensor methods 

Proteins interacting with a GENSET protein, or a fragment comprising a contiguous span of 
at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 
25, 30, 40, 50, or 100 amino acids of a polypeptide selected from the group consisting of sequences 
of SEQ ID Nos: 242-482, mature polypeptides included in SEQ ID Nos: 242-272 and 274-384, as 
well as full-length and mature polypeptides encoded by the clone inserts of the deposited clone 
pool, can also be screened by using an Optical Biosensor as described in Edwards and 
Leatherbarrow (1997) and also in Szabo et al. (1995), the disclosures of which are incorporated by 
reference. This technique permits the detection of interactions between molecules in real time, 
without the need of labeled molecules. This technique is based on the surface plasmon resonance 
(SPR) phenomenon. Briefly, the candidate ligand molecule to be tested is attached to a surface 
(such as a carboxymethyl dextran matrix). A light beam is directed towards the side of the surface 
that does not contain the sample to be tested and is reflected by said surface. The SPR phenomenon 
causes a decrease in the intensity of the reflected light at a specific combination of angle and 
wavelength. The binding of candidate ligand molecules causes a change in the refractive index at 
the surface, which change is detected as a change in the SPR signal. For screening of candidate 
ligand molecules or substances that are able to interact with the GENSET protein, or a fragment 
thereof, the GENSET protein, or a fragment thereof, is immobilized onto a surface. This surface 
comprises one side of a cell through which flows the candidate molecule to be assayed. The 
binding of the candidate molecule on the GENSET protein, or a fragment thereof, is detected as a 
change of the SPR signal. The candidate molecules tested may be proteins, peptides, carbohydrates, 
lipids, or small molecules generated by combinatorial chemistry. This technique may also be 
performed by immobilizing eukaryotic or prokaryotic cells or lipid vesicles exhibiting an 
endogenous or a recombinantly expressed GENSET protein at their surface. 

The main advantage of the method is that it allows the determination of the association rate 
between the GENSET protein and molecules interacting with the GENSET protein. It is thus 
possible to select specifically ligand molecules interacting with the GENSET protein, or a fragment 
thereof, through strong or conversely weak association constants. 
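
For illustration, the kinetic quantities referred to above are commonly related through a simple 1:1 binding model, which is an assumption made here and is not prescribed by the present specification. In that model the observed rate during the association phase is ka·C + kd, and the equilibrium dissociation constant is KD = kd/ka. The function names and numerical values in the sketch are hypothetical.

```python
import math


def dissociation_constant(ka, kd):
    """KD (M) from association rate ka (1/(M*s)) and dissociation rate kd (1/s)."""
    if ka <= 0:
        raise ValueError("ka must be positive")
    return kd / ka


def association_response(t, conc, ka, kd, rmax):
    """SPR response at time t (s) during the association phase, 1:1 model.

    conc is the analyte concentration (M) and rmax the assumed response at
    saturation (arbitrary response units).
    """
    k_obs = ka * conc + kd              # observed rate constant
    r_eq = rmax * ka * conc / k_obs     # plateau response at this concentration
    return r_eq * (1.0 - math.exp(-k_obs * t))


if __name__ == "__main__":
    ka, kd = 1e5, 1e-3                  # hypothetical rate constants
    print("KD =", dissociation_constant(ka, kd), "M")
    for t in (0, 100, 500, 2000):
        print(t, round(association_response(t, 1e-8, ka, kd, 100.0), 2))
```

Ligands may then be ranked by KD or by the individual rate constants, depending on whether strong or weak association is sought.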

E. Candidate ligands obtained through a two-hybrid screening assay. 

The yeast two-hybrid system is designed to study protein-protein interactions in vivo (Fields 
and Song, 1989), which disclosure is hereby incorporated by reference in its entirety, and relies 
upon the fusion of a bait protein to the DNA binding domain of the yeast Gal4 protein. This 
technique is also described in US Patent No. 5,667,973 and US Patent No. 5,283,173, the 
technical teachings of both patents being herein incorporated by reference. 

The general procedure of library screening by the two-hybrid assay may be performed as 
described by Harper et al. (1993) or as described by Cho et al. (1998) or also Fromont-Racine et al. 
(1997), which disclosures are hereby incorporated by reference in their entireties. 

The bait protein or polypeptide comprises, consists essentially of, or consists of a GENSET 
polypeptide or a fragment thereof comprising a contiguous span of at least 6 amino acids, preferably 
at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids 
of a polypeptide selected from the group consisting of sequences of SEQ ID Nos: 242-482, mature 
polypeptides included in SEQ ID Nos: 242-272 and 274-384, as well as full-length and mature 
polypeptides encoded by the clone inserts of the deposited clone pool. 

More precisely, the nucleotide sequence encoding the GENSET polypeptide or a fragment 
or variant thereof is fused to a polynucleotide encoding the DNA binding domain of the GAL4 
protein, the fused nucleotide sequence being inserted in a suitable expression vector, for example 
pAS2 or pM3. 

Then, a human cDNA library is constructed in a specially designed vector, such that the 
human cDNA insert is fused to a nucleotide sequence in the vector that encodes the transcriptional 
activation domain of the GAL4 protein. Preferably, the vector used is the pACT vector. The 
polypeptides encoded by the nucleotide inserts of the human cDNA library are termed "prey" 
polypeptides. 

A third vector contains a detectable marker gene, such as the beta galactosidase gene or the CAT 
gene, that is placed under the control of a regulation sequence that is responsive to the binding of a 
complete Gal4 protein containing both the transcriptional activation domain and the DNA binding 
domain. For example, the vector pG5HC may be used. 

Two different yeast strains are also used. As an illustrative but non-limiting example, the 
two different yeast strains may be the following: 

- Y190, the genotype of which is (MATa, leu2-3,112 ura3-12, trp1-901, his3-D200, 
ade2-101, gal4D gal80D URA3 GAL-lacZ, LYS GAL-HIS3, cyhr); 

- Y187, the genotype of which is (MATα gal4 gal80 his3 trp1-901 ade2-101 ura3-52 
leu2-3,-112 URA3 GAL-lacZ met-), which is the opposite mating type of Y190. 

Briefly, 20 µg of pAS2/GENSET and 20 µg of pACT-cDNA library are co-transformed 
into yeast strain Y190. The transformants are selected for growth on minimal media lacking 
histidine, leucine and tryptophan, but containing the histidine synthesis inhibitor 3-AT (50 mM). 
Positive colonies are screened for beta galactosidase by filter lift assay. The double positive 
colonies (His+, beta-gal+) are then grown on plates lacking histidine and leucine, but containing 
tryptophan and cycloheximide (10 mg/ml), to select for loss of pAS2/GENSET plasmids but 
retention of pACT-cDNA library plasmids. The resulting Y190 strains are mated with Y187 strains 
expressing GENSET or non-related control proteins, such as cyclophilin B, lamin, or SNF1, as Gal4 
fusions as described by Harper et al. (1993) and by Bram et al. (1993), which disclosures are hereby 
incorporated by reference in their entireties, and screened for beta galactosidase by filter lift assay. 
Yeast clones that are beta-gal positive after mating with the control Gal4 fusions are considered 
false positives. 

In another embodiment of the two-hybrid method according to the invention, interaction 
between the GENSET or a fragment or variant thereof with cellular proteins may be assessed using 
the Matchmaker Two Hybrid System 2 (Catalog No. K1604-1, Clontech). As described in the 
manual accompanying the kit, the disclosure of which is incorporated herein by reference, nucleic 
acids encoding the GENSET protein or a portion thereof are inserted into an expression vector such 
that they are in frame with DNA encoding the DNA binding domain of the yeast transcriptional 
activator GAL4. A desired cDNA, preferably human cDNA, is inserted into a second expression 
vector such that it is in frame with DNA encoding the activation domain of GAL4. The two 
expression plasmids are transformed into yeast and the yeast are plated on selection medium which 
selects for expression of selectable markers on each of the expression vectors as well as GAL4 
dependent expression of the HIS3 gene. Transformants capable of growing on medium lacking 
histidine are screened for GAL4 dependent lacZ expression. Those cells which are positive in both the 
histidine selection and the lacZ assay contain an interaction between GENSET and the protein or peptide 
encoded by the initially selected cDNA insert. 
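
The double-positive criterion stated above may be sketched as a simple filter over scored transformants. The record type, field names and clone identifiers below are hypothetical and serve only to illustrate the logic; they do not correspond to any data of the present specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transformant:
    clone_id: str                  # hypothetical identifier of the cDNA clone
    grows_without_histidine: bool  # result of the histidine selection
    lacz_positive: bool            # result of the lacZ (beta galactosidase) assay


def candidate_interactors(transformants):
    """Return the clone ids that are positive in both assays."""
    return [
        t.clone_id
        for t in transformants
        if t.grows_without_histidine and t.lacz_positive
    ]


if __name__ == "__main__":
    scored = [
        Transformant("clone-01", True, True),
        Transformant("clone-02", True, False),
        Transformant("clone-03", False, False),
    ]
    print(candidate_interactors(scored))
```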

Compounds Modulating GENSET biological activity 

Another method of screening for compounds that modulate GENSET gene expression 
and/or biological activity is by measuring the effects of test compounds on a given cellular property 
in a host cell, such as apoptosis, proliferation, differentiation, protein glycosylation, etc., using a 
variety of techniques known to those skilled in the art, including those described herein. 

In one embodiment, the present invention relates to a method of identifying an agent which 
alters GENSET activity, wherein a nucleic acid construct comprising a nucleic acid which encodes 
a mammalian GENSET polypeptide is introduced into a host cell. The host cells produced are 
maintained under conditions appropriate for expression of the encoded mammalian GENSET 
polypeptides, whereby the nucleic acid is expressed. The host cells are then contacted with a 
compound to be assessed (an agent) and the given cellular property of the cells is detected in the 
presence of the compound to be assessed. Detection of a change in the given cellular property in 
the presence of the agent indicates that the agent alters GENSET activity. 

In a particular embodiment, the invention relates to a method of identifying an agent which 
is an activator of GENSET activity, wherein detection of a change of the given cellular property in 
the presence of the agent indicates that the agent activates GENSET activity. In another particular 
embodiment, the invention relates to a method of identifying an agent which is an inhibitor of 
GENSET activity, wherein detection of a change of the given cellular property in the presence of 
the agent indicates that the agent inhibits GENSET activity. 

Methods of Screening for Compounds Modulating GENSET Expression and/or Activity 

The present invention also relates to methods of screening compounds for their ability to 
modulate (e.g. increase or inhibit) the activity or expression of GENSET. More specifically, the 
present invention relates to methods of testing compounds for their ability either to increase or to 
decrease expression or activity of GENSET. The assays are performed in vitro or in vivo. 

In vitro methods 

In vitro, cells expressing GENSET are incubated in the presence and absence of the test 
compound. By comparing the level of GENSET expression or the level of GENSET activity in the 
presence of the test compound with the level in its absence, compounds can be identified 
that suppress or enhance GENSET expression or activity. Alternatively, constructs comprising a 
GENSET regulatory sequence operably linked to a reporter gene (e.g. luciferase, chloramphenicol 
acetyl transferase, LacZ, green fluorescent protein, etc.) can be introduced into host cells and the 
effect of the test compounds on expression of the reporter gene detected. Cells suitable for use in 
the foregoing assays include, but are not limited to, cells having the same origin as tissues or cell 
lines in which the polypeptide is known to be expressed using the data from Table IX. 

Consequently, the present invention encompasses a method for screening molecules that 
modulate the expression of a GENSET gene, said screening method comprising the steps of: 

a) cultivating a prokaryotic or a eukaryotic cell that has been transfected with a nucleotide 
sequence encoding a GENSET protein or a variant or a fragment thereof, placed under the control 
of its own promoter; 

b) bringing into contact said cultivated cell with a molecule to be tested; 

c) quantifying the expression of said GENSET protein or a variant or a fragment thereof in 
the presence of said molecule. 

Using DNA recombination techniques well known to one skilled in the art, the GENSET 
protein encoding DNA sequence is inserted into an expression vector, downstream from its 
promoter sequence. As an illustrative example, the promoter sequence of the GENSET gene is 
contained in the 5' untranscribed region of the GENSET genomic DNA. 

The quantification of the expression of a GENSET protein may be performed either at the 
mRNA level (using for example Northern blots, RT-PCR, preferably quantitative RT-PCR with 
primers and probes specific for the GENSET mRNA of interest) or at the protein level (using 
polyclonal or monoclonal antibodies in immunoassays such as ELISA or RIA assays, Western blots, 
or immunochemistry). 
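
Where quantitative RT-PCR is used, one widely used way of expressing the result is the comparative threshold-cycle (2^-ΔΔCt) calculation, sketched below. This calculation is not prescribed by the present specification; it assumes roughly 100% amplification efficiency and a stably expressed reference transcript, and the function name and Ct values are hypothetical.

```python
def fold_change_ddct(ct_target_treated, ct_reference_treated,
                     ct_target_control, ct_reference_control):
    """Relative expression of the target mRNA, treated versus control cells.

    Each argument is a threshold cycle (Ct). The reference transcript is an
    assumed internal control used to normalize for input RNA amount.
    """
    delta_treated = ct_target_treated - ct_reference_treated
    delta_control = ct_target_control - ct_reference_control
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)


if __name__ == "__main__":
    # Hypothetical Ct values: target crosses threshold two cycles earlier
    # in the presence of the tested molecule.
    print(fold_change_ddct(24.0, 18.0, 26.0, 18.0))
```

A value above 1 suggests increased expression in the presence of the tested molecule, and a value below 1 suggests decreased expression.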

The present invention also concerns a method for screening substances or molecules that are 
able to increase, or in contrast to decrease, the level of expression of a GENSET gene. Such a 
method may allow one skilled in the art to select substances exerting a regulating effect on the 
expression level of a GENSET gene and which may be useful as active ingredients included in 
pharmaceutical compositions for treating patients suffering from disorders associated with abnormal 
levels of GENSET products. 

Thus, also part of the present invention is a method for screening a candidate molecule that 
modulates the expression of a GENSET gene, which method comprises the following steps: 

a) providing a recombinant cell host containing a nucleic acid, wherein said nucleic acid 
comprises a GENSET 5' regulatory region or a regulatory active fragment or variant thereof, 
operably linked to a polynucleotide encoding a detectable protein; 

b) obtaining a candidate molecule; and 

c) determining the ability of said candidate molecule to modulate the expression levels of 
said polynucleotide encoding the detectable protein. 
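
Step c) above may, for example, be evaluated by comparing replicate reporter signals obtained with and without the candidate molecule. The sketch below is an illustration only; the function name, the two-fold threshold and the example readings are assumptions and do not form part of the method as described.

```python
from statistics import mean


def classify_modulator(treated_signals, untreated_signals, threshold=2.0):
    """Classify a candidate from replicate reporter signals (arbitrary units).

    threshold is an assumed fold-change cutoff (> 1). Returns a tuple of
    (label, fold_change).
    """
    if threshold <= 1.0:
        raise ValueError("threshold must be greater than 1")
    baseline = mean(untreated_signals)
    if baseline <= 0:
        raise ValueError("untreated signal must be positive")
    fold = mean(treated_signals) / baseline
    if fold >= threshold:
        return "increases expression", fold
    if fold <= 1.0 / threshold:
        return "decreases expression", fold
    return "no clear effect", fold


if __name__ == "__main__":
    print(classify_modulator([310.0, 290.0], [100.0, 100.0]))
```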

In a further embodiment, said nucleic acid comprising a GENSET 5' regulatory region or a 
regulatory active fragment or variant thereof includes the 5'UTR region of a GENSET cDNA 
selected from the group consisting of the 5'UTRs of the sequences of SEQ ID Nos 1-241, 
sequences of clone inserts of the deposited clone pool, regulatory active fragments and variants 
thereof. In a more preferred embodiment of the above screening method, said nucleic acid includes 
a promoter sequence which is endogenous with respect to the GENSET 5'UTR sequence. In 
another more preferred embodiment of the above screening method, said nucleic acid includes a 
promoter sequence which is exogenous with respect to the GENSET 5'UTR sequence defined 
therein. 

Preferred polynucleotides encoding a detectable protein are polynucleotides encoding beta 
galactosidase, green fluorescent protein (GFP) and chloramphenicol acetyl transferase (CAT). 

The invention further relates to a method for the production of a pharmaceutical 
composition comprising a method of screening a candidate molecule that modulates the expression 
of a GENSET gene and furthermore mixing the identified molecule with a pharmaceutically 
acceptable carrier. 

The invention also pertains to kits for the screening of a candidate substance modulating the 
expression of a GENSET gene. Preferably, such kits comprise a recombinant vector that allows the 
expression of a GENSET 5' regulatory region or a regulatory active fragment or a variant thereof, 
operably linked to a polynucleotide encoding a detectable protein or a GENSET protein or a 
fragment or a variant thereof. More preferably, such kits include a recombinant vector that 
comprises a nucleic acid including the 5'UTR region of a GENSET cDNA selected from the group 
comprising the 5'UTRs of the sequences of SEQ ID Nos 1-241, sequences of clone inserts of the 
deposited clone pool, regulatory active fragments and variants thereof, being operably linked to a 
polynucleotide encoding a detectable protein. 

For the design of suitable recombinant vectors useful for performing the screening methods 
described above, reference is made to the section of the present specification wherein the preferred 
recombinant vectors of the invention are detailed. 

Another object of the present invention comprises methods and kits for the screening of 
candidate substances that interact with a GENSET polypeptide, fragments or variants thereof. By 
their capacity to bind covalently or non-covalently to a GENSET protein, fragments or variants 
thereof, these substances or molecules may be advantageously used both in vitro and in vivo. 

In vitro, said interacting molecules may be used as detection means in order to identify the 
presence of a GENSET protein in a sample, preferably a biological sample. 

The invention also concerns a method for the screening of a candidate substance that interacts 
with a GENSET polypeptide, fragments or variants thereof, said method comprising the following steps: 

a) providing a polypeptide comprising, consisting essentially of, or consisting of a GENSET 
protein or a fragment comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 
10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of a 
polypeptide selected from the group consisting of sequences of SEQ ID Nos: 242-482, mature polypeptides 
included in SEQ ID Nos: 242-272 and 274-384, as well as full-length and mature polypeptides encoded by the 
clone inserts of the deposited clone pool; 

b) obtaining a candidate substance; 

c) bringing into contact said polypeptide with said candidate substance; 

d) detecting the complexes formed between said polypeptide and said candidate substance. 

The invention further relates to a method for the production of a pharmaceutical 
composition comprising a method for the screening of a candidate substance that interacts with a 
GENSET polypeptide, fragments or variants thereof and furthermore mixing the identified 
substance with a pharmaceutically acceptable carrier. 

The invention further concerns a kit for the screening of a candidate substance interacting 
with the GENSET polypeptide, wherein said kit comprises: 

a) a polypeptide comprising, consisting essentially of, or consisting of a GENSET protein or 
a fragment comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino 
acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of a polypeptide 
selected from the group consisting of sequences of SEQ ID Nos: 242-482, mature polypeptides included 
in SEQ ID Nos: 242-272 and 274-384, as well as full-length and mature polypeptides encoded by the clone 
inserts of the deposited clone pool; and 

b) optionally means useful to detect the complex formed between said polypeptide or a 
variant thereof and the candidate substance. 

In a preferred embodiment of the kit described above, the detection means comprises a 
monoclonal or polyclonal antibody binding to said GENSET protein or fragment or variant thereof. 

In vivo methods 

Compounds that suppress or enhance GENSET expression can also be identified using in 
vivo screens. In these assays, the test compound is administered (e.g. IV, IP, IM, orally, or 
otherwise) to the animal, for example, at a variety of dose levels. The effect of the compound on 
GENSET expression is determined by comparing GENSET levels, for example in tissues known to 
express the gene of interest using, for example, the data obtained in Table IX, and using Northern 
blots, immunoassays, PCR, etc., as described above. Suitable test animals include rodents (e.g., 
mice and rats), primates, and other mammals. Humanized mice can also be used as test animals, that 
is, mice in which the endogenous mouse protein is ablated (knocked out) and the homologous human 
protein added back by standard transgenic approaches. Such mice express only the human form of 
a protein. Humanized mice expressing only the human GENSET can be used to study in vivo 
responses to potential agents regulating GENSET protein or mRNA levels. As an example, 
transgenic mice have been produced carrying the human apoE4 gene. They are then bred with a 
mouse line that lacks endogenous apoE, to produce an animal model carrying human proteins 
believed to be instrumental in development of Alzheimer's pathology. Such transgenic animals are 
useful for dissecting the biochemical and physiological steps of disease, and for development of 
therapies for disease intervention (Loring et al., 1996) (incorporated herein by reference in its 
entirety). 



Uses for compounds modulating GENSET expression and/or biological activity 

Using in vivo (or in vitro) systems, it may be possible to identify compounds that exert a 
tissue specific effect, for example, that increase GENSET expression or activity only in tissues of 
interest. Screening procedures such as those described above are also useful for identifying agents 
for their potential use in pharmacological intervention strategies. Agents that enhance GENSET 
expression or stimulate its activity may thus be used to treat disorders which require upregulated 
levels of GENSET gene expression and/or activity. Compounds that suppress GENSET expression 
or inhibit its activity can be used to treat disorders which require downregulated levels of GENSET 
gene expression and/or activity. 

Also encompassed by the present invention is an agent which interacts with GENSET 
directly or indirectly, and inhibits or enhances GENSET expression and/or function. In one 
embodiment, the agent is an inhibitor which interferes with GENSET directly (e.g., by binding 
GENSET) or indirectly (e.g., by blocking the ability of GENSET to have a GENSET biological 
activity). In a particular embodiment, an inhibitor of GENSET protein is an antibody specific for 
GENSET protein or a functional portion of GENSET; that is, the antibody binds a GENSET 
polypeptide. For example, the antibody can be specific for a polypeptide encoded by one of the 
amino acid sequences of human GENSET genes (SEQ ID Nos: 242-482, mature polypeptides 
included in SEQ ID Nos: 242-272 and 274-384, full-length and mature polypeptides encoded by the 
clone inserts of the deposited clone pool), mammalian GENSET or portions thereof. Alternatively, the 
inhibitor can be an agent other than an antibody (e.g., small organic molecule, protein or peptide) 
which binds GENSET and blocks its activity. For example, the inhibitor can be an agent which 
mimics GENSET structurally, but lacks its function. Alternatively, it can be an agent which binds 
to or interacts with a molecule which GENSET normally binds with or interacts with, thus blocking 
GENSET from doing so and preventing it from exerting the effects it would normally exert. 

In another embodiment, the agent is an enhancer (activator) of GENSET which increases 
the activity of GENSET (increases the effect of a given amount or level of GENSET), increases the 
length of time it is effective (by preventing its degradation or otherwise prolonging the time during 
which it is active), or both, either directly or indirectly. 

The GENSET sequences of the present invention can also be used to generate nonhuman 
gene knockout animals, such as mice, which lack a GENSET gene or transgenically overexpress 
GENSET. For example, such GENSET gene knockout mice can be generated and used to obtain 
further insight into the function of GENSET as well as assess the specificity of GENSET activators 
and inhibitors. Also, overexpression of GENSET (e.g., human GENSET) in transgenic mice can 
be used as a means of creating a test system for GENSET activators and inhibitors (e.g., against 
human GENSET). In addition, the GENSET gene can be used to clone the GENSET 
promoter/enhancer in order to identify regulators of GENSET transcription. GENSET gene 
knockout animals include animals which completely or partially lack the GENSET gene and/or 
GENSET activity or function. Thus the present invention relates to a method of inhibiting (partially 
or completely) GENSET biological activity in a mammal (e.g., human) comprising administering to 
the mammal an effective amount of an inhibitor of GENSET. The invention also relates to a 
method of enhancing GENSET biological activity in a mammal comprising administering to the 
mammal an effective amount of an enhancer of GENSET. 

Inhibiting GENSET expression 

Therapeutic compositions according to the present invention may comprise advantageously 
one or several GENSET oligonucleotide fragments as an antisense tool or a triple helix tool that 
inhibits the expression of the corresponding GENSET gene. 

Antisense Approach 

In antisense approaches, nucleic acid sequences complementary to an mRNA are hybridized 
to the mRNA intracellularly, thereby blocking the expression of the protein encoded by the mRNA. 
The antisense nucleic acid molecules to be used in gene therapy may be either DNA or RNA 
sequences. Preferred methods using antisense polynucleotides according to the present invention are 
the procedures described by Sczakiel et al. (1995), which disclosure is hereby incorporated by 
reference in its entirety. 

Preferably, the antisense tools are chosen among the polynucleotides (15-200 bp long) that 
are complementary to GENSET mRNA, more preferably to the 5' end of the GENSET mRNA. In 
another embodiment, a combination of different antisense polynucleotides complementary to 
different parts of the desired targeted gene is used. 

Other preferred antisense polynucleotides according to the present invention are sequences 
complementary to either a sequence of GENSET mRNAs comprising the translation initiation 
codon ATG or a sequence of GENSET genomic DNA containing a splicing donor or acceptor site. 
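
As an illustrative sketch, an antisense DNA oligonucleotide covering the translation initiation codon can be derived as the reverse complement of the corresponding sense-strand window. The sketch simplifies by taking the first ATG found in the supplied sequence as the initiation codon, which is an assumption; the function names, flank sizes and example sequence are hypothetical and are not GENSET sequences.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    seq = dna.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G and T")
    return seq.translate(_COMPLEMENT)[::-1]


def antisense_over_start_codon(sense_cdna, upstream=6, downstream=6):
    """Antisense oligonucleotide spanning the first ATG of the sense strand.

    upstream/downstream are the numbers of flanking bases included on each
    side of the ATG (truncated at the ends of the supplied sequence).
    """
    seq = sense_cdna.upper()
    pos = seq.find("ATG")
    if pos < 0:
        raise ValueError("no ATG found in the supplied sequence")
    begin = max(0, pos - upstream)
    end = min(len(seq), pos + 3 + downstream)
    return reverse_complement(seq[begin:end])


if __name__ == "__main__":
    # Hypothetical sense-strand fragment, for demonstration only.
    print(antisense_over_start_codon("CCCCCCATGAAAAAA"))
```

The flank sizes would in practice be chosen so that the resulting oligonucleotide falls within the 15-200 bp length range preferred above.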

Preferably, the antisense polynucleotides of the invention have a 3' polyadenylation signal 
that has been replaced with a self-cleaving ribozyme sequence, such that RNA polymerase II 
transcripts are produced without poly(A) at their 3' ends, these antisense polynucleotides being 
incapable of export from the nucleus, such as described by Liu et al. (1994), which disclosure is 
hereby incorporated by reference in its entirety. In a preferred embodiment, these GENSET 
antisense polynucleotides also comprise, within the ribozyme cassette, a histone stem-loop structure 
to stabilize cleaved transcripts against 3'-5' exonucleolytic degradation, such as the structure 
described by Eckner et al. (1991), which disclosure is hereby incorporated by reference in its 
entirety. 

The antisense nucleic acids should have a length and melting temperature sufficient to 
permit formation of an intracellular duplex having sufficient stability to inhibit the expression of the 
GENSET mRNA in the duplex. Strategies for designing antisense nucleic acids suitable for use in 
gene therapy are disclosed in Green et al. (1986) and Izant and Weintraub (1984), the disclosures 
of which are incorporated herein by reference. 
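
For orientation only, a first estimate of the melting temperature of a short oligonucleotide may be obtained with the Wallace rule, Tm ≈ 2 °C × (A+T) + 4 °C × (G+C). This rule of thumb is not taken from the present specification, is generally considered reasonable only for short oligonucleotides (roughly up to 20 nucleotides) under high-salt conditions, and does not model intracellular duplex stability; the function name and example sequences are hypothetical.

```python
def wallace_tm(oligo):
    """Approximate melting temperature (degrees C) by the Wallace rule.

    Accepts DNA or RNA letters; U is counted like T. Intended only as a
    rough estimate for short oligonucleotides.
    """
    seq = oligo.upper()
    at_count = sum(seq.count(base) for base in "ATU")
    gc_count = sum(seq.count(base) for base in "GC")
    if at_count + gc_count != len(seq):
        raise ValueError("sequence contains characters other than A, C, G, T, U")
    return 2.0 * at_count + 4.0 * gc_count


if __name__ == "__main__":
    # Hypothetical 16-mer with 50% GC content.
    print(wallace_tm("GGGGCCCCAAAATTTT"))
```

Longer antisense molecules are better assessed with nearest-neighbor thermodynamic models, which are outside the scope of this sketch.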

In some strategies, antisense molecules are obtained by reversing the orientation of the 
GENSET coding region with respect to a promoter so as to transcribe the opposite strand from that 
which is normally transcribed in the cell. The antisense molecules may be transcribed using in vitro 
transcription systems such as those which employ T7 or SP6 polymerase to generate the transcript. 
Another approach involves transcription of GENSET antisense nucleic acids in vivo by operably 
linking DNA containing the antisense sequence to a promoter in a suitable expression vector. 

Alternatively, oligonucleotides which are complementary to the strand normally transcribed 
in the cell may be synthesized in vitro. Thus, the antisense nucleic acids are complementary to the 
corresponding mRNA and are capable of hybridizing to the mRNA to create a duplex. In some 
embodiments, the antisense sequences may contain modified sugar phosphate backbones to increase
stability and make them less sensitive to RNase activity. Examples of modifications suitable for use
in antisense strategies include 2'-O-methyl RNA oligonucleotides and peptide nucleic acid (PNA)
oligonucleotides. Further examples are described by Rossi et al., (1991), which disclosure is hereby
incorporated by reference in its entirety.
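
The complementarity relationship described above can be illustrated with a minimal Python sketch. It assumes the target mRNA is supplied as a plain 5'-to-3' string; the function names and the example sequence are hypothetical and are not taken from any GENSET sequence.

```python
# Minimal sketch: derive an antisense RNA that can base-pair with a target mRNA.
# Function names and sequences are illustrative assumptions only.

_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(mrna: str) -> str:
    """Return the antisense RNA (written 5'->3') for an mRNA written 5'->3'."""
    mrna = mrna.upper().replace("T", "U")  # accept cDNA-style input as well
    if set(mrna) - set("ACGU"):
        raise ValueError("sequence contains characters other than A, C, G, U/T")
    # Complement each base, then reverse so the result also reads 5'->3'.
    return mrna.translate(_COMPLEMENT)[::-1]


def antisense_window(mrna: str, start: int, length: int) -> str:
    """Antisense oligonucleotide against mrna[start:start + length] (0-based)."""
    target = mrna[start:start + length]
    if len(target) < length:
        raise ValueError("window runs past the end of the sequence")
    return antisense_rna(target)


if __name__ == "__main__":
    print(antisense_rna("AUGGCC"))
    print(antisense_window("AUGGCCAAA", 3, 3))
```

The sketch covers base complementarity only; it does not model the backbone modifications, duplex stability or melting temperature discussed in the text.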

Various types of antisense oligonucleotides complementary to the sequence of the GENSET
cDNA or genomic DNA may be used. In one preferred embodiment, stable and semi-stable
antisense oligonucleotides described in International Application No. PCT WO94/23026, hereby 
incorporated by reference, are used. In these molecules, the 3' end or both the 3' and 5' ends are 
engaged in intramolecular hydrogen bonding between complementary base pairs. These molecules 
are better able to withstand exonuclease attacks and exhibit increased stability compared to
conventional antisense oligonucleotides. 

In another preferred embodiment, the antisense oligodeoxynucleotides against herpes 
simplex virus types 1 and 2 described in International Application No. WO 95/04141, hereby 
incorporated by reference, are used. 

In yet another preferred embodiment, the covalently cross-linked antisense oligonucleotides
described in International Application No. WO 96/31523, hereby incorporated by reference, are
used. These double- or single-stranded oligonucleotides comprise one or more, respectively, inter-
or intra-oligonucleotide covalent cross-linkages, wherein the linkage consists of an amide bond
between a primary amine group of one strand and a carboxyl group of the other strand or of the
same strand, respectively, the primary amine group being directly substituted in the 2' position of
the strand nucleotide monosaccharide ring, and the carboxyl group being carried by an aliphatic
spacer group substituted on a nucleotide or nucleotide analog of the other strand or the same strand,
respectively.

The antisense oligodeoxynucleotides and oligonucleotides disclosed in International 
Application No. WO 92/18522, incorporated by reference, may also be used. These molecules are 
stable to degradation and contain at least one transcription control recognition sequence which binds
to control proteins and are effective as decoys therefor. These molecules may contain "hairpin" 
structures, "dumbbell" structures, "modified dumbbell" structures, "cross-linked" decoy structures 
and "loop" structures. 

In another preferred embodiment, the cyclic double-stranded oligonucleotides described in 
European Patent Application No. 0 572 287 A2, hereby incorporated by reference, are used. These
ligated oligonucleotide "dumbbells" contain the binding site for a transcription factor and inhibit 
expression of the gene under control of the transcription factor by sequestering the factor. 

Use of the closed antisense oligonucleotides disclosed in International Application No. WO 
92/19732, hereby incorporated by reference, is also contemplated. Because these molecules have 
no free ends, they are more resistant to degradation by exonucleases than are conventional
oligonucleotides. These oligonucleotides may be multifunctional, interacting with several regions
which are not adjacent to the target mRNA. 

The appropriate level of antisense nucleic acids required to inhibit gene expression may be 
determined using in vitro expression analysis. The antisense molecule may be introduced into the 
cells by diffusion, injection, infection or transfection using procedures known in the art. For
example, the antisense nucleic acids can be introduced into the body as a bare or naked
oligonucleotide, oligonucleotide encapsulated in lipid, oligonucleotide sequence encapsidated by
viral protein, or as an oligonucleotide operably linked to a promoter contained in an expression
vector. The expression vector may be any of a variety of expression vectors known in the art,
including retroviral or viral vectors, vectors capable of extrachromosomal replication, or integrating
vectors. The vectors may be DNA or RNA. 

The antisense molecules are introduced onto cell samples at a number of different
concentrations, preferably between 1 x 10^-10 M and 1 x 10^-4 M. Once the minimum concentration that can
adequately control gene expression is identified, the optimized dose is translated into a dosage
suitable for use in vivo. For example, an inhibiting concentration in culture of 1 x 10^-7 M translates into
a dose of approximately 0.6 mg/kg bodyweight. Levels of oligonucleotide approaching 100 mg/kg 
bodyweight or higher may be possible after testing the toxicity of the oligonucleotide in laboratory 
animals. It is additionally contemplated that cells from the vertebrate are removed, treated with the 
antisense oligonucleotide, and reintroduced into the vertebrate. 
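
The text does not state how the culture concentration is converted to a bodyweight dose. One way a figure of this magnitude can be reached is sketched below, under two assumptions that are not taken from the source: an oligonucleotide molecular weight of about 6,000 g/mol (roughly an 18- to 20-mer) and uniform distribution in a volume of about 1 litre per kilogram of bodyweight. The function name and default values are hypothetical.

```python
# Rough conversion of an in vitro inhibitory concentration to a mg/kg dose.
# ASSUMPTIONS (not from the source): molecular weight ~6000 g/mol and a
# distribution volume of ~1 L per kg of bodyweight.


def concentration_to_dose_mg_per_kg(
    molar_conc: float,
    mol_weight_g_per_mol: float = 6000.0,
    litres_per_kg: float = 1.0,
) -> float:
    """Convert a molar concentration to an approximate dose in mg/kg."""
    if molar_conc < 0 or mol_weight_g_per_mol <= 0 or litres_per_kg <= 0:
        raise ValueError("inputs must be positive")
    mg_per_litre = molar_conc * mol_weight_g_per_mol * 1000.0  # g/L -> mg/L
    return mg_per_litre * litres_per_kg


if __name__ == "__main__":
    dose = concentration_to_dose_mg_per_kg(1e-7)
    print(f"approximate dose: {dose:.2f} mg/kg")
```

This is an order-of-magnitude estimate only; it ignores uptake, clearance and toxicity, which the text addresses through testing in laboratory animals.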

In a preferred application of this invention, the polypeptide encoded by the gene is first
identified, so that the effectiveness of antisense inhibition on translation can be monitored using
techniques that include but are not limited to antibody-mediated tests such as RIAs and ELISA,
functional assays, or radiolabeling.

An alternative to the antisense technology that is used according to the present invention 
comprises using ribozymes that will bind to a target sequence via their complementary 
polynucleotide tail and that will cleave the corresponding RNA by hydrolyzing its target site
(namely "hammerhead ribozymes"). Briefly, the simplified cycle of a hammerhead ribozyme
comprises (1) sequence specific binding to the target RNA via complementary antisense sequences;
(2) site-specific hydrolysis of the cleavable motif of the target strand; and (3) release of cleavage
products, which gives rise to another catalytic cycle. Indeed, the use of long-chain antisense
polynucleotides (at least 30 bases long) or ribozymes with long antisense arms is advantageous. A
preferred delivery system for antisense ribozymes is achieved by covalently linking these antisense
ribozymes to lipophilic groups or by using liposomes as a convenient vector. Preferred antisense
ribozymes according to the present invention are prepared as described by Rossi et al., (1991) and
Sczakiel et al. (1995), the specific preparation procedures being referred to in said articles being
herein incorporated by reference.

Triple Helix Approach 

The GENSET genomic DNA may also be used to inhibit the expression of the GENSET 
gene based on intracellular triple helix formation. 

Triple helix oligonucleotides are used to inhibit transcription from a genome. They are
particularly useful for studying alterations in cell activity when it is associated with a particular
gene. The GENSET cDNAs or genomic DNAs of the present invention or, more preferably, a 
fragment of those sequences, can be used to inhibit gene expression in individuals having diseases 
associated with expression of a particular gene. Similarly, a portion of the GENSET genomic DNA 
can be used to study the effect of inhibiting GENSET transcription within a cell. Traditionally,
homopurine sequences were considered the most useful for triple helix strategies. However,
homopyrimidine sequences can also inhibit gene expression. Such homopyrimidine
oligonucleotides bind to the major groove at homopurine:homopyrimidine sequences. Thus, both
types of sequences from the GENSET genomic DNA are contemplated within the scope of this 
invention. 

To carry out gene therapy strategies using the triple helix approach, the sequences of the
GENSET genomic DNA are first scanned to identify 10-mer to 20-mer homopyrimidine or
homopurine stretches which could be used in triple-helix based strategies for inhibiting GENSET
expression. Following identification of candidate homopyrimidine or homopurine stretches, their
efficiency in inhibiting GENSET expression is assessed by introducing varying amounts of
oligonucleotides containing the candidate sequences into tissue culture cells which express the
GENSET gene. 
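
The scanning step described above can be sketched in Python as follows. This is a minimal illustration, assuming the genomic sequence is available as a plain string of A, C, G and T; the function name, the tuple layout and the rule for splitting runs longer than 20 bases are hypothetical choices, not part of the described method.

```python
import re
from typing import List, Tuple

# (start, end, kind, sequence); coordinates are 0-based, end exclusive.
Hit = Tuple[int, int, str, str]


def find_triplex_candidates(dna: str, min_len: int = 10, max_len: int = 20) -> List[Hit]:
    """Find homopurine (A/G) and homopyrimidine (C/T) stretches in a DNA string.

    Runs shorter than min_len are ignored. Runs longer than max_len are cut
    into consecutive pieces of at most max_len bases (an assumption of this
    sketch); a trailing piece shorter than min_len is dropped.
    """
    dna = dna.upper()
    pattern = re.compile(r"[AG]{%d,}|[CT]{%d,}" % (min_len, min_len))
    hits: List[Hit] = []
    for match in pattern.finditer(dna):
        kind = "homopurine" if match.group()[0] in "AG" else "homopyrimidine"
        for start in range(match.start(), match.end(), max_len):
            end = min(start + max_len, match.end())
            if end - start >= min_len:
                hits.append((start, end, kind, dna[start:end]))
    return hits


if __name__ == "__main__":
    # Invented example sequence, not a GENSET sequence.
    example = "CCT" + "AGGAGAAGGAGA" + "TCA" + "CTCTTCCTCT" + "GA"
    for hit in find_triplex_candidates(example):
        print(hit)
```

Candidates found this way would still have to be tested in tissue culture cells as the text describes; the scan only narrows the search.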

The oligonucleotides can be introduced into the cells using a variety of methods known to
those skilled in the art, including but not limited to calcium phosphate precipitation, DEAE- 
Dextran, electroporation, liposome-mediated transfection or native uptake. 

Treated cells are monitored for altered cell function or reduced GENSET expression using 
techniques such as Northern blotting, RNase protection assays, or PCR based strategies to monitor
the transcription levels of the GENSET gene in cells which have been treated with the
oligonucleotide. The cell functions to be monitored are predicted based upon the homologies of the
target gene corresponding to the cDNA from which the oligonucleotide was derived with known
gene sequences that have been associated with a particular function. The cell functions can also be
predicted based on the presence of abnormal physiology within cells derived from individuals with
a particular inherited disease, particularly when the cDNA is associated with the disease using 
techniques described in the section entitled "Identification of genes associated with hereditary 
diseases or drug response". 

The oligonucleotides which are effective in inhibiting gene expression in tissue culture cells 
may then be introduced in vivo using the techniques and at a dosage calculated based on the in vitro
results, as described in the section entitled "Antisense Approach". 

In some embodiments, the natural (beta) anomers of the oligonucleotide units can be 
replaced with alpha anomers to render the oligonucleotide more resistant to nucleases. Further, an 
intercalating agent such as ethidium bromide, or the like, can be attached to the 3' end of the alpha
oligonucleotide to stabilize the triple helix. For information on the generation of oligonucleotides
suitable for triple helix formation see Griffin et al. (1989), which is hereby incorporated by this
reference. 

Treating GENSET-related disorders 

The present invention further relates to methods of treating diseases/disorders by increasing 
GENSET activity and/or expression. The invention also relates to methods of treating
diseases/disorders by decreasing GENSET activity and/or expression. These methodologies can be
effected using compounds selected using screening protocols such as those described herein and/or
by using the gene therapy and antisense approaches described in the art and herein. Gene therapy
can be used to effect targeted expression of GENSET. The GENSET coding sequence can be
cloned into an appropriate expression vector and targeted to a particular cell type(s) to achieve
efficient, high level expression. Introduction of the GENSET coding sequence into target cells can
be achieved, for example, using particle mediated DNA delivery (Haynes, 1996 and Maurer, 1999),
direct injection of naked DNA (Levy et al., 1996; and Felgner, 1996), or viral vector mediated
transport (Smith et al., 1996; Stone et al., 2000; Wu and Atai, 2000), each of which disclosures is
hereby incorporated by reference in its entirety. Tissue specific effects can be achieved, for
example, in the case of virus mediated transport by using viral vectors that are tissue specific, or by
the use of promoters that are tissue specific.

Combinatorial approaches can also be used to ensure that the GENSET coding sequence is 
activated in the target tissue (Butt and Karathanasis, 1995; Miller and Whelan, 1997), which
disclosures are hereby incorporated by reference in their entireties. Antisense oligonucleotides
complementary to GENSET mRNA can be used to selectively diminish or ablate the expression of 
the protein, for example, at sites of inflammation. More specifically, antisense constructs or 
antisense oligonucleotides can be used to inhibit the production of GENSET in high expressing 
cells such as those cited in the third column of Table X. Antisense mRNA can be produced by
transfecting into target cells an expression vector with the GENSET gene sequence, or portion
thereof, oriented in an antisense direction relative to the direction of transcription. Appropriate
vectors include viral vectors, including retroviral, adenoviral, and adeno-associated viral vectors, as
well as nonviral vectors. Tissue specific promoters can be used. Alternatively, antisense
oligonucleotides can be introduced directly into target cells to achieve the same goal. (See also
other delivery methodologies described herein in connection with gene therapy.) Oligonucleotides
can be selected/designed to achieve a high level of specificity (Wagner et al., 1996), which
disclosure is hereby incorporated by reference in its entirety. The therapeutic methodologies 
described herein are applicable to both human and non-human mammals (including cats and dogs). 

Pharmaceutical and physiologically acceptable compositions 

The present invention also relates to pharmaceutical or physiologically acceptable
compositions comprising, as active agent, the polypeptides, nucleic acids or antibodies of the
invention. The invention also relates to compositions comprising, as active agent, compounds
selected using the above-described screening protocols. Such compositions include the active agent
in combination with a pharmaceutically or physiologically acceptable carrier. In the case
of naked DNA, the "carrier" may be gold particles. The amount of active agent in the composition
can vary with the agent, the patient and the effect sought. Likewise, the dosing regimen can vary 
depending on the composition and the disease/disorder to be treated. 

Therefore, the invention relates to methods for the production of a pharmaceutical
composition comprising selecting an active agent, compound, substance or molecule
using any of the screening methods described herein and furthermore mixing the identified active
agent, compound, substance or molecule with a pharmaceutically acceptable carrier.

The pharmaceutical compositions utilized in this invention may be administered by any 
number of routes including, but not limited to, oral, intravenous, intramuscular, intra-arterial, 
intramedullary, intrathecal, intraventricular, transdermal, subcutaneous, intraperitoneal, intranasal,
enteral, topical, sublingual, or rectal means. In addition to the active ingredients, these
pharmaceutical compositions may contain suitable pharmaceutically acceptable carriers comprising
excipients and auxiliaries which facilitate processing of the active compounds into preparations
which can be used pharmaceutically. Further details on techniques for formulation and
administration may be found in the latest edition of Remington's Pharmaceutical Sciences (Mack
Publishing Co., Easton, Pa.).

Pharmaceutical compositions for oral administration can be formulated using
pharmaceutically acceptable carriers well known in the art in dosages suitable for oral
administration. Such carriers enable the pharmaceutical compositions to be formulated as tablets, 
pills, dragees, capsules, liquids, gels, syrups, slurries, suspensions, and the like, for ingestion by the 
patient. 

Pharmaceutical preparations for oral use can be obtained through a combination of active
compounds with solid excipient, optionally grinding a resulting mixture, and processing the mixture
of granules, after adding suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable
excipients are carbohydrate or protein fillers, such as sugars, including lactose, sucrose, mannitol, or
sorbitol; starch from corn, wheat, rice, potato, or other plants; cellulose, such as methyl cellulose,
hydroxypropylmethyl-cellulose, or sodium carboxymethylcellulose; gums including arabic and
tragacanth; and proteins such as gelatin and collagen. If desired, disintegrating or solubilizing
agents may be added, such as the cross-linked polyvinyl pyrrolidone, agar, alginic acid, or a salt
thereof, such as sodium alginate. 

Dragee cores may be used in conjunction with suitable coatings, such as concentrated sugar
solutions, which may also contain gum arabic, talc, polyvinylpyrrolidone, carbopol gel,
polyethylene glycol, and/or titanium dioxide, lacquer solutions, and suitable organic solvents or
solvent mixtures. Dyestuffs or pigments may be added to the tablets or dragee coatings for product 
identification or to characterize the quantity of active compound, i.e., dosage. 

Pharmaceutical preparations which can be used orally include push-fit capsules made of
gelatin, as well as soft, sealed capsules made of gelatin and a coating, such as glycerol or sorbitol.
Push-fit capsules can contain active ingredients mixed with a filler or binders, such as lactose or
starches, lubricants, such as talc or magnesium stearate, and, optionally, stabilizers. In soft
capsules, the active compounds may be dissolved or suspended in suitable liquids, such as fatty oils,
liquid, or liquid polyethylene glycol with or without stabilizers.

Pharmaceutical formulations suitable for parenteral administration may be formulated in
aqueous solutions, preferably in physiologically compatible buffers such as Hanks' solution,
Ringer's solution, or physiologically buffered saline. Aqueous injection suspensions may contain
substances which increase the viscosity of the suspension, such as sodium carboxymethylcellulose,
sorbitol, or dextran. Additionally, suspensions of the active compounds may be prepared as
appropriate oily injection suspensions. Suitable lipophilic solvents or vehicles include fatty oils
such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or triglycerides, or liposomes.
Optionally, the suspension may also contain suitable stabilizers or agents which increase the
solubility of the compounds to allow for the preparation of highly concentrated solutions. 

For topical or nasal administration, penetrants appropriate to the particular barrier to be 
permeated are used in the formulation. Such penetrants are generally known in the art. 

The pharmaceutical compositions of the present invention may be manufactured in a
manner that is known in the art, e.g., by means of conventional mixing, dissolving, granulating,
dragee-making, levigating, emulsifying, encapsulating, entrapping, or lyophilizing processes.

The pharmaceutical composition may be provided as a salt and can be formed with many 
acids, including but not limited to, hydrochloric, sulfuric, acetic, lactic, tartaric, malic, succinic, etc.
Salts tend to be more soluble in aqueous or other protonic solvents than are the corresponding free
base forms. In other cases, the preferred preparation may be a lyophilized powder which may 
contain any or all of the following: 1-50 mM histidine, 0.1%-2% sucrose, and 2-7% mannitol, at a 
pH range of 4.5 to 5.5, that is combined with buffer prior to use. 

After pharmaceutical compositions have been prepared, they can be placed in an
appropriate container and labeled for treatment of an indicated condition. For administration of
GENSET, such labeling would include amount, frequency, and method of administration. 

Pharmaceutical compositions suitable for use in the invention include compositions wherein 
the active ingredients are contained in an effective amount to achieve the intended purpose. The 
determination of an effective dose is well within the capability of those skilled in the art. 

For any compound, the therapeutically effective dose can be estimated initially either in cell
culture assays, e.g., of neoplastic cells, or in animal models, usually mice, rabbits, dogs, or pigs.
The animal model may also be used to determine the appropriate concentration range and route of 
administration. Such information can then be used to determine useful doses and routes for 
administration in humans. 

A therapeutically effective dose refers to that amount of active ingredient, for example
GENSET or fragments thereof, antibodies of GENSET, agonists, antagonists or inhibitors of
GENSET, which ameliorates the symptoms or condition. Therapeutic efficacy and toxicity may be
determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g.,
ED50 (the dose therapeutically effective in 50% of the population) and LD50 (the dose lethal to
50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic
index, and it can be expressed as the ratio, LD50/ED50. Pharmaceutical compositions which
exhibit large therapeutic indices are preferred. The data obtained from cell culture assays and
animal studies is used in formulating a range of dosage for human use. The dosage contained in
such compositions is preferably within a range of circulating concentrations that include the ED50
with little or no toxicity. The dosage varies within this range depending upon the dosage form
employed, sensitivity of the patient, and the route of administration. 
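
The therapeutic index defined above is a simple ratio, which the following Python sketch makes explicit. The function name and the numerical values are hypothetical and serve only to show the arithmetic; they are not measured data.

```python
# Therapeutic index as defined in the text: LD50 / ED50.
# Both doses must be expressed in the same units (for example mg/kg).


def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return LD50 / ED50; larger values indicate a wider safety margin."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


if __name__ == "__main__":
    # Invented figures for two hypothetical compositions.
    candidates = {"composition_A": (500.0, 10.0), "composition_B": (120.0, 30.0)}
    ranked = sorted(
        candidates,
        key=lambda name: therapeutic_index(*candidates[name]),
        reverse=True,
    )
    for name in ranked:
        print(name, therapeutic_index(*candidates[name]))
```

Ranking by this ratio mirrors the stated preference for compositions with large therapeutic indices.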

The exact dosage will be determined by the practitioner, in light of factors related to the
subject that requires treatment. Dosage and administration are adjusted to provide sufficient levels 
of the active moiety or to maintain the desired effect. Factors which may be taken into account 
include the severity of the disease state, general health of the subject, age, weight, and gender of the 
subject, diet, time and frequency of administration, drug combination(s), reaction sensitivities, and
tolerance/response to therapy. Long-acting pharmaceutical compositions may be administered every
3 to 4 days, every week, or once every two weeks depending on half-life and clearance rate of the 
particular formulation. 

Normal dosage amounts may vary from 0.1 to 100,000 micrograms, up to a total dose of 
about 1 g, depending upon the route of administration. Guidance as to particular dosages and
methods of delivery is provided in the literature and generally available to practitioners in the art. 
Those skilled in the art will employ different formulations for nucleotides than for proteins or their 
inhibitors. Similarly, delivery of polynucleotides or polypeptides will be specific to particular cells, 
conditions, locations, etc. 

Uses of GENSET sequences: Computer-Related Embodiments

As used herein the term "cDNA codes of SEQ ID Nos: 1-241" encompasses the nucleotide
sequences of SEQ ID Nos: 1-241 and of clone inserts of the deposited clone pool, fragments
thereof, nucleotide sequences homologous thereto, and sequences complementary to all of the
preceding sequences. The fragments include fragments of SEQ ID Nos: 1-241 comprising at least
8, 10, 12, 15, 18, 20, 25, 28, 30, 35, 40, 50, 75, 100, 150, 200, 300, 400, 500, 1000 or 2000
consecutive nucleotides of SEQ ID Nos: 1-241. Preferably the fragments include signal sequences
and coding sequences for mature polypeptides of SEQ ID Nos: 1-31 and 33-143, polynucleotides
described in Tables Va and Table Vb, polynucleotides encoding polypeptides described in Table VI,
polynucleotide described herein as encoding polypeptides having a biological activity, or fragments
comprising at least 8, 10, 12, 15, 18, 20, 25, 28, 30, 35, 40, 50, 75, 100, 150, 200, 300, 400, 500,
1000 or 2000 consecutive nucleotides of the signal sequences or coding sequences for mature
polypeptides of SEQ ID Nos: 1-31 and 33-143, polynucleotides described in Tables Va and Table
Vb, polynucleotides encoding polypeptides described in Table VI, and polynucleotide described
herein as encoding polypeptides having a biological activity. Homologous sequences and fragments
of SEQ ID Nos: 1-241 refer to a sequence having at least 99%, 98%, 97%, 96%, 95%, 90%, 85%,
80%, or 75% identity to these sequences. Identity may be determined using any of the computer
programs and parameters described herein, including BLAST2N with the default parameters or with
any modified parameters. Homologous sequences also include RNA sequences in which uridines
replace the thymines in the cDNA codes of SEQ ID Nos: 1-241. The homologous sequences may
be obtained using any of the procedures described herein or may result from the correction of a
sequencing error as described above. Preferably the homologous sequences and fragments of SEQ
ID Nos: 1-241 include polynucleotides homologous to signal sequences and coding sequences for
mature polypeptides of SEQ ID Nos: 1-31 and 33-143, polynucleotides described in Tables Va and
Table Vb, polynucleotides encoding a polypeptide fragment described as a domain in Table VI,
polynucleotide described herein as encoding polypeptides having a biological activity, or fragments
comprising at least 8, 10, 12, 15, 18, 20, 25, 28, 30, 35, 40, 50, 75, 100, 150, 200, 300, 400, 500,
1000 or 2000 consecutive nucleotides of the signal sequences and coding sequences for mature
polypeptides of SEQ ID Nos: 1-31 and 33-143, polynucleotides described in Tables Va and Table
Vb, polynucleotides described in Table VI, and polynucleotide described herein as encoding
polypeptides having a biological activity. It will be appreciated that the cDNA codes of SEQ ID
Nos: 1-241 can be represented in the traditional single character format (See the inside back cover
of Stryer, 1995) or in any other format which records the identity of the nucleotides in a sequence.
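
Three of the notions used above (fragments of consecutive nucleotides, the RNA form with uridines in place of thymines, and percent identity) can be illustrated with a minimal Python sketch. It is not BLAST2N or FASTA: the identity function assumes the two sequences have already been aligned to equal length, with "-" marking gaps, and all function names and example strings are hypothetical.

```python
from typing import List


def to_rna(cdna: str) -> str:
    """RNA form of a cDNA code: uridines (U) replace thymines (T)."""
    return cdna.upper().replace("T", "U")


def fragments(seq: str, length: int) -> List[str]:
    """All fragments of exactly `length` consecutive residues of `seq`."""
    if length <= 0:
        raise ValueError("length must be positive")
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned sequences of equal length.

    Gap characters ('-') never count as matches. Real tools also compute
    the alignment itself, which this sketch does not attempt.
    """
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    print(to_rna("ATGCT"))
    print(fragments("ATGCATGCAT", 8))
    print(percent_identity("ATGCATGCAT", "ATGCATGCAA"))
```

A sequence pair would meet, for example, the 90% threshold recited above when `percent_identity` returns 90.0 or more for the chosen alignment.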

As used herein the term "polypeptide codes of SEQ ID Nos: 242-482" encompasses the
polypeptide sequences of SEQ ID Nos: 242-482, the signal peptides included in SEQ ID Nos:
242-272 and 274-384, the mature polypeptides included in SEQ ID Nos: 242-272 and 274-384, the
full-length, signal peptides and mature polypeptide sequences encoded by the clone inserts of the
deposited clone pool, polypeptide sequences homologous thereto, or fragments of any of the
preceding sequences. Homologous polypeptide sequences refer to a polypeptide sequence having at
least 99%, 98%, 97%, 96%, 95%, 90%, 85%, 80%, 75% identity to one of the polypeptide
sequences of SEQ ID Nos: 242-482, the signal peptides included in SEQ ID Nos: 242-272 and
274-384, the mature polypeptides included in SEQ ID Nos: 242-272 and 274-384, the full-length, signal
peptides and mature polypeptide sequences encoded by the clone inserts of the deposited clone
pool. Identity may be determined using any of the computer programs and parameters described
herein, including FASTA with the default parameters or with any modified parameters. The
homologous sequences may be obtained using any of the procedures described herein or may result
from the correction of a sequencing error as described above. The polypeptide fragments comprise
at least 5, 6, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, 200, 250, 300, 350, 400, 450 or
500 consecutive amino acids of the polypeptides of SEQ ID Nos: 242-482. Preferably, the
fragments include polypeptides encoded by the signal peptides included in SEQ ID Nos: 242-272
and 274-384, mature polypeptides included in SEQ ID Nos: 242-272 and 274-384, polynucleotides
described in Tables Va and in Table Vb, domains described in Table VI, epitopes described in
Table VII, polypeptides described herein as having a biological activity, or fragments comprising at
least 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, 150, 200, 300 or 400 consecutive amino acids of the
signal peptides included in SEQ ID Nos: 242-272 and 274-384, mature polypeptides included in
SEQ ID Nos: 242-272 and 274-384, the polypeptides encoded by the polynucleotides described in
Tables Va and in Table Vb, domains of Table VI, epitopes of Table VII or of polypeptides
described herein as having a biological activity. It will be appreciated that the polypeptide codes of
the SEQ ID Nos: 242-482 can be represented in the traditional single character format or three letter
format (See the inside back cover of Stryer, 1995) or in any other format which relates the identity
of the polypeptides in a sequence.

It will be appreciated by those skilled in the art that the nucleic acid codes of the invention 
and polypeptide codes of the invention can be stored, recorded, and manipulated on any medium 
which can be read and accessed by a computer. As used herein, the words "recorded" and "stored"
refer to a process for storing information on a computer medium. A skilled artisan can readily 
adopt any of the presently known methods for recording information on a computer readable 
medium to generate manufactures comprising one or more of the nucleic acid codes of the 
invention, or one or more of the polypeptide codes of the invention. Another aspect of the present
invention is a computer readable medium having recorded thereon at least 2, 5, 10, 15, 20, 25, 30,
50, 75, 100, 150 or 200 nucleic acid codes of the invention. Another aspect of the present invention
is a computer readable medium having recorded thereon at least 2, 5, 10, 15, 20, 25, 30, 50, 75, 100,
150 or 200 polypeptide codes of the invention.
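
As one illustration of recording sequence codes on a computer readable medium and reading them back, the Python sketch below writes named sequences to a text file and reloads them. The FASTA-style layout, the function names and the record names are assumptions chosen for the example; the text does not prescribe any particular file format.

```python
import os
import tempfile
from typing import Dict


def write_fasta(records: Dict[str, str], path: str, width: int = 60) -> None:
    """Record named sequences in a FASTA-style text file (assumed format)."""
    with open(path, "w", encoding="ascii") as handle:
        for name, seq in records.items():
            handle.write(f">{name}\n")
            # Wrap each sequence at `width` characters per line.
            for i in range(0, len(seq), width):
                handle.write(seq[i:i + width] + "\n")


def read_fasta(path: str) -> Dict[str, str]:
    """Read back the sequences stored by write_fasta."""
    records: Dict[str, str] = {}
    name = None
    with open(path, encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                name = line[1:]
                records[name] = ""
            elif name is not None:
                records[name] += line
    return records


if __name__ == "__main__":
    # Invented placeholder sequences, not GENSET sequences.
    codes = {"example_nucleic_acid": "ATGC" * 40, "example_polypeptide": "MKTAYIAK"}
    with tempfile.TemporaryDirectory() as folder:
        target = os.path.join(folder, "codes.fasta")
        write_fasta(codes, target)
        print(read_fasta(target) == codes)
```

Any format that preserves the identity and order of the residues would serve equally well; a plain text layout is used here only because it is easy to inspect.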

Computer readable media include magnetically readable media, optically readable media,
electronically readable media and magnetic/optical media. For example, the computer readable
media may be a hard disk, a floppy disk, a magnetic tape, CD-ROM, Digital Versatile Disk (DVD),
Random Access Memory (RAM), or Read Only Memory (ROM) as well as other types of
media known to those skilled in the art. 

Embodiments of the present invention include systems, particularly computer systems
which store and manipulate the sequence information described herein. One example of a computer
system 100 is illustrated in block diagram form in Figure 2. As used herein, "a computer system"
refers to the hardware components, software components, and data storage components used to
analyze the nucleotide sequences of the nucleic acid codes of the invention or the amino acid
sequences of the polypeptide codes of the invention. In one embodiment, the computer system 100
is a Sun Enterprise 1000 server (Sun Microsystems, Palo Alto, CA). The computer system 100
preferably includes a processor for processing, accessing and manipulating the sequence data. The 
processor 105 can be any well-known type of central processing unit, such as the Pentium III from 
Intel Corporation, or similar processor from Sun, Motorola, Compaq or International Business 
Machines. 

Preferably, the computer system 100 is a general purpose system that comprises the
processor 105 and one or more internal data storage components 110 for storing data, and one or
more data retrieving devices for retrieving the data stored on the data storage components. A
skilled artisan can readily appreciate that any one of the currently available computer systems is
suitable.

In one particular embodiment, the computer system 100 includes a processor 105 connected
to a bus which is connected to a main memory 115 (preferably implemented as RAM) and one or
more internal data storage devices 110, such as a hard drive and/or other computer readable media
having data recorded thereon. In some embodiments, the computer system 100 further includes one
or more data retrieving devices 118 for reading the data stored on the internal data storage devices
110.

The data retrieving device 118 may represent, for example, a floppy disk drive, a compact
disk drive, a magnetic tape drive, etc. In some embodiments, the internal data storage device 110 is
a removable computer readable medium such as a floppy disk, a compact disk, a magnetic tape, etc. 
containing control logic and/or data recorded thereon. The computer system 100 may 
advantageously include or be programmed by appropriate software for reading the control logic 
and/or the data from the data storage component once inserted in the data retrieving device. 
10 The computer system 100 includes a display 120 which is used to display output to a 

computer user. It should also be noted that the computer system 100 can be linked to other 
computer systems 125a-c in a network or wide area network to provide centralized access to the 
computer system 100. 

Software for accessing and processing the nucleotide sequences of the nucleic acid codes of
the invention or the amino acid sequences of the polypeptide codes of the invention (such as search
tools, compare tools, and modeling tools, etc.) may reside in main memory 115 during execution.

In some embodiments, the computer system 100 may further comprise a sequence comparer
for comparing the above-described nucleic acid codes of the invention or the polypeptide codes of
the invention stored on a computer readable medium to reference nucleotide or polypeptide
sequences stored on a computer readable medium. A "sequence comparer" refers to one or more
programs which are implemented on the computer system 100 to compare a nucleotide or
polypeptide sequence with other nucleotide or polypeptide sequences and/or compounds including
but not limited to peptides, peptidomimetics, and chemicals stored within the data storage means.
For example, the sequence comparer may compare the nucleotide sequences of nucleic acid codes
of the invention or the amino acid sequences of the polypeptide codes of the invention stored on a
computer readable medium to reference sequences stored on a computer readable medium to
identify homologies, motifs implicated in biological function, or structural motifs. The various
sequence comparer programs identified elsewhere in this patent specification are particularly
contemplated for use in this aspect of the invention.

Figure 3 is a flow diagram illustrating one embodiment of a process 200 for comparing a
new nucleotide or protein sequence with a database of sequences in order to determine the
homology levels between the new sequence and the sequences in the database. The database of
sequences can be a private database stored within the computer system 100, or a public database
such as GENBANK, PIR or SWISSPROT that is available through the Internet.

The process 200 begins at a start state 201 and then moves to a state 202 wherein the new
sequence to be compared is stored to a memory in a computer system 100. As discussed above, the
memory could be any type of memory, including RAM or an internal storage device.
The process 200 then moves to a state 204 wherein a database of sequences is opened for
analysis and comparison. The process 200 then moves to a state 206 wherein the first sequence
stored in the database is read into a memory on the computer. A comparison is then performed at a
state 210 to determine if the first sequence is the same as the second sequence. It is important to
note that this step is not limited to performing an exact comparison between the new sequence and
the first sequence in the database. Methods are well known to those of skill in the art for
comparing two nucleotide or protein sequences, even if they are not identical. For example, gaps
can be introduced into one sequence in order to raise the homology level between the two tested
sequences. The parameters that control whether gaps or other features are introduced into a
sequence during comparison are normally entered by the user of the computer system.

Once a comparison of the two sequences has been performed at the state 210, a
determination is made at a decision state 212 whether the two sequences are the same. Of course,
the term "same" is not limited to sequences that are absolutely identical. Sequences that are within
the homology parameters entered by the user will be marked as "same" in the process 200.

If a determination is made that the two sequences are the same, the process 200 moves to a
state 214 wherein the name of the sequence from the database is displayed to the user. This state
notifies the user that the sequence with the displayed name fulfills the homology constraints that
were entered. Once the name of the stored sequence is displayed to the user, the process 200 moves
to a decision state 218 wherein a determination is made whether more sequences exist in the
database. If no more sequences exist in the database, then the process 200 terminates at an end state
220. However, if more sequences do exist in the database, then the process 200 moves to a state
224 wherein a pointer is moved to the next sequence in the database so that it can be compared to
the new sequence. In this manner, the new sequence is aligned and compared with every sequence
in the database.

It should be noted that if a determination had been made at the decision state 212 that the
sequences were not homologous, then the process 200 would move immediately to the decision
state 218 in order to determine if any other sequences were available in the database for
comparison.
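
The loop of process 200 can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation contemplated by the specification: the function names, the in-memory list standing in for the database, the 90% threshold, and the simple ungapped identity score are all assumptions made for the example. In practice a gapped comparison (for example, one of the programs enumerated elsewhere in this specification) would take the place of the scoring function.

```python
from typing import Callable, Iterable, List, Tuple


def percent_identity(a: str, b: str) -> float:
    """Ungapped, position-by-position identity as a percentage of len(a).

    Hypothetical stand-in for a real (gapped) sequence comparer.
    """
    if not a:
        return 0.0
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


def scan_database(
    new_seq: str,
    database: Iterable[Tuple[str, str]],
    min_identity: float = 90.0,  # user-entered homology parameter (assumed)
    score: Callable[[str, str], float] = percent_identity,
) -> List[str]:
    """Return the names of stored sequences judged the "same" as new_seq."""
    hits = []
    for name, stored_seq in database:           # states 206 and 224
        level = score(new_seq, stored_seq)      # state 210
        if level >= min_identity:               # decision state 212
            hits.append(name)                   # state 214
    return hits                                 # end state 220


if __name__ == "__main__":
    db = [
        ("seqA", "ACGTACGTAC"),
        ("seqB", "ACGTACGTAA"),
        ("seqC", "TTTTTTTTTT"),
    ]
    for hit in scan_database("ACGTACGTAC", db):
        print(hit)
```

Passing the scoring function as a parameter keeps the database loop independent of the comparison method, which mirrors the statement above that "same" is defined by the homology parameters entered by the user.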

Accordingly, one aspect of the present invention is a computer system comprising a
processor and a data storage device having stored thereon a nucleic acid code of the invention or a
polypeptide code of the invention. In some embodiments the computer system further comprises a
data storage device having retrievably stored thereon reference nucleotide sequences or polypeptide
sequences to be compared to the nucleic acid code of the invention or polypeptide code of the
invention and a sequence comparer for conducting the comparison. For example, the sequence
comparer may comprise a computer program which indicates polymorphisms. In other aspects of
the computer system, the system further comprises an identifier which identifies features in said
sequence. The sequence comparer may indicate a homology level between the sequences compared
or identify motifs implicated in biological function and structural motifs in the nucleic acid code of
the invention and polypeptide codes of the invention or it may identify structural motifs in
sequences which are compared to these nucleic acid codes and polypeptide codes. In some
embodiments, the data storage device may have stored thereon the sequences of at least 2, 5, 10, 15,
20, 25, 30, 50, 75, 100, 150 or 200 of the nucleic acid codes of the invention or polypeptide codes
of the invention.

Another aspect of the present invention is a method for determining the level of homology
between a nucleic acid code of the invention and a reference nucleotide sequence, comprising the
steps of reading the nucleic acid code and the reference nucleotide sequence through the use of a
computer program which determines homology levels and determining homology between the
nucleic acid code and the reference nucleotide sequence with the computer program. The computer
program may be any of a number of computer programs for determining homology levels, including
those specifically enumerated herein, including BLAST2N with the default parameters or with any
modified parameters. The method may be implemented using the computer systems described
above. The method may also be performed by reading 2, 5, 10, 15, 20, 25, 30, 50, 75, 100, 150 or
200 of the above described nucleic acid codes of the invention through the use of the computer
program and determining homology between the nucleic acid codes and reference nucleotide
sequences.

Figure 4 is a flow diagram illustrating one embodiment of a process 250 in a computer for
determining whether two sequences are homologous. The process 250 begins at a start state 252 and
then moves to a state 254 wherein a first sequence to be compared is stored to a memory. The
second sequence to be compared is then stored to a memory at a state 256. The process 250 then
moves to a state 260 wherein the first character in the first sequence is read and then to a state 262
wherein the first character of the second sequence is read. It should be understood that if the
sequence is a nucleotide sequence, then the character would normally be either A, T, C, G or U. If
the sequence is a protein sequence, then it should be in the single letter amino acid code so that the
first and second sequences can be easily compared.

A determination is then made at a decision state 264 whether the two characters are the
same. If they are the same, then the process 250 moves to a state 268 wherein the next characters in
the first and second sequences are read. A determination is then made whether the next characters
are the same. If they are, then the process 250 continues this loop until two characters are not the
same. If a determination is made that the next two characters are not the same, the process 250
moves to a decision state 274 to determine whether there are any more characters in either sequence
to read.

If there aren't any more characters to read, then the process 250 moves to a state 276
wherein the level of homology between the first and second sequences is displayed to the user. The
level of homology is determined by calculating the proportion of characters between the sequences
that were the same out of the total number of characters in the first sequence. Thus, if every
character in a first 100 nucleotide sequence aligned with every character in a second sequence, the
homology level would be 100%.
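
The character-by-character comparison of process 250 can be illustrated with the following Python sketch. It is a simplified reading of the flow diagram under stated assumptions: the function name is hypothetical, no gaps are introduced, and characters of the first sequence that have no counterpart in a shorter second sequence are counted as non-matching.

```python
def homology_level(first: str, second: str) -> float:
    """Percentage of positions in `first` whose character equals `second`'s.

    Ungapped sketch of process 250; the name and the handling of unequal
    lengths are assumptions made for illustration.
    """
    if not first:
        raise ValueError("first sequence must not be empty")
    first, second = first.upper(), second.upper()

    same = 0
    i = 0
    # Decision state 274: continue while both sequences have characters left.
    while i < len(first) and i < len(second):
        if first[i] == second[i]:   # decision state 264
            same += 1
        i += 1                      # state 268: read the next characters

    # State 276: proportion of identical characters out of the length of
    # the first sequence, expressed as a percentage.
    return 100.0 * same / len(first)


if __name__ == "__main__":
    print(homology_level("ACGT", "ACGT"))
    print(homology_level("ACGT", "ACGA"))
```

For nucleotide input the characters would normally be A, T, C, G or U, and for protein input the single letter amino acid code, as noted above; the sketch itself does not validate the alphabet.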

Alternatively, the computer program may be a computer program which compares the
nucleotide sequences of the nucleic acid codes of the present invention to reference nucleotide
sequences in order to determine whether the nucleic acid code of the invention differs from a
reference nucleic acid sequence at one or more positions. Optionally such a program records the
length and identity of inserted, deleted or substituted nucleotides with respect to the sequence of
either the reference polynucleotide or the nucleic acid code of the invention. In one embodiment,
the computer program may be a program which determines whether the nucleotide sequences of the
nucleic acid codes of the invention contain one or more single nucleotide polymorphisms (SNP)
with respect to a reference nucleotide sequence. These single nucleotide polymorphisms may each
comprise a single base substitution, insertion, or deletion.
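
As a minimal sketch of such a difference-recording program, the Python example below classifies each differing column of two sequences as a substitution, insertion, or deletion relative to the reference. It assumes the two sequences have already been aligned by some other program and padded with "-" gap characters to equal length; that input convention, the `Difference` record, and the function name are assumptions for illustration only.

```python
from typing import List, NamedTuple


class Difference(NamedTuple):
    kind: str         # "substitution", "insertion" or "deletion"
    ref_pos: int      # 1-based reference coordinate; for an insertion,
                      # the reference base immediately before the insert
    ref_base: str     # "-" when the reference has a gap
    query_base: str   # "-" when the query has a gap


def find_differences(ref_aln: str, query_aln: str) -> List[Difference]:
    """List single-column differences between two pre-aligned sequences."""
    if len(ref_aln) != len(query_aln):
        raise ValueError("aligned sequences must have equal length")

    diffs = []
    ref_pos = 0
    for r, q in zip(ref_aln.upper(), query_aln.upper()):
        if r != "-":
            ref_pos += 1          # advance only on real reference bases
        if r == q:
            continue
        if r == "-":
            kind = "insertion"    # base present only in the query
        elif q == "-":
            kind = "deletion"     # base present only in the reference
        else:
            kind = "substitution"
        diffs.append(Difference(kind, ref_pos, r, q))
    return diffs


if __name__ == "__main__":
    for d in find_differences("ACGT-ACGT", "ACCTGAC-T"):
        print(d)
```

Each record covers a single column; a fuller program could merge adjacent columns of the same kind to report the length of a multi-base insertion or deletion.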

Another embodiment of the present invention is a method for comparing a first sequence to
a reference sequence wherein the first sequence is selected from the group consisting of a cDNA
code of SEQ ID NOs. 1-297 and a polypeptide code of SEQ ID NOs. 298-594 comprising the steps
of reading the first sequence and the reference sequence through use of a computer program which
compares sequences and determining differences between the first sequence and the reference
sequence with the computer program. In some aspects of this embodiment, said step of determining
differences between the first sequence and the reference sequence comprises identifying
polymorphisms.

Another aspect of the present invention is a method for determining the level of homology
between a polypeptide code of the invention and a reference polypeptide sequence, comprising the
steps of reading the polypeptide code of the invention and the reference polypeptide sequence
through use of a computer program which determines homology levels and determining homology
between the polypeptide code and the reference polypeptide sequence using the computer program.

Accordingly, another aspect of the present invention is a method for determining whether a
nucleic acid code of the invention differs at one or more nucleotides from a reference nucleotide
sequence comprising the steps of reading the nucleic acid code and the reference nucleotide
sequence through use of a computer program which identifies differences between nucleic acid
sequences and identifying differences between the nucleic acid code and the reference nucleotide
sequence with the computer program. In some embodiments, the computer program is a program
which identifies single nucleotide polymorphisms. The method may be implemented by the
computer systems described above and the method illustrated in Figure 4. The method may also be
performed by reading at least 2, 5, 10, 15, 20, 25, 30, 50, 75, 100, 150 or 200 of the nucleic acid
codes of the invention and the reference nucleotide sequences through the use of the computer
program and identifying differences between the nucleic acid codes and the reference nucleotide
sequences with the computer program.

Thus, another embodiment of the present invention is a method for comparing a first
sequence to a reference sequence wherein the first sequence is selected from the group consisting of
the nucleic acid codes of the present invention or the polypeptide codes of the present invention
comprising the steps of reading the first sequence and the reference sequence through use of a
computer program which compares sequences and determining differences between the first
sequence and the reference sequence with the computer program. In some aspects of this
embodiment, said step of determining differences between the first sequence and the reference
sequence comprises identifying polymorphisms.

Another aspect of the present invention is a method for determining the level of identity
between a first sequence and a reference sequence, wherein the first sequence is selected from the
group consisting of the nucleic acid codes of the present invention or the polypeptide codes of the
present invention, comprising the steps of reading the first sequence and the reference sequence
through the use of a computer program which determines identity levels and determining identity
between the first sequence and the reference sequence with the computer program.

In other embodiments the computer based system may further comprise an identifier for
identifying features within the nucleotide sequences of the nucleic acid codes of the invention or the
amino acid sequences of the polypeptide codes of the invention. An "identifier" refers to one or
more programs which identify certain features within the above-described nucleotide sequences of
the nucleic acid codes of the invention or the amino acid sequences of the polypeptide codes of the
invention. In one embodiment, the identifier may comprise a program which identifies an open
reading frame in the cDNA codes of the invention.

Another embodiment of the present invention is a method for identifying a feature in a
sequence selected from the group consisting of the nucleic acid codes of the invention or the amino
acid sequences of the polypeptide codes of the invention comprising the steps of reading the
sequence through the use of a computer program which identifies features in sequences and
identifying features in the sequence with said computer program. In one aspect of this embodiment,
the computer program comprises a computer program which identifies open reading frames. In a
further embodiment, the computer program comprises a program that identifies linear or structural
motifs in a polypeptide sequence.
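
By way of illustration only, a program which identifies open reading frames might resemble the Python sketch below. It scans the three forward reading frames of a DNA sequence for an ATG codon followed in frame by a stop codon. The function name, the `min_codons` parameter, and the restriction to the forward strand and the standard stop codons are assumptions of this sketch; the specification does not prescribe a particular ORF-finding method.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumed)


def find_orfs(seq: str, min_codons: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) slices of forward-strand ORFs, stop codon included.

    Coordinates are 0-based and half-open, so seq[start:end] is the ORF.
    `min_codons` counts codons before the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                        # first in-frame start codon
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                     # look for the next ORF
    return sorted(orfs)


if __name__ == "__main__":
    dna = "CCATGAAATAGGG"
    for start, end in find_orfs(dna):
        print(start, end, dna[start:end])
```

A fuller program would typically also scan the reverse complement and allow alternative genetic codes.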

Figure 5 is a flow diagram illustrating one embodiment of an identifier process 300 for
detecting the presence of a feature in a sequence. The process 300 begins at a start state 302 and
then moves to a state 304 wherein a first sequence that is to be checked for features is stored to a
memory 115 in the computer system 100. The process 300 then moves to a state 306 wherein a
database of sequence features is opened. Such a database would include a list of each feature's
attributes along with the name of the feature. For example, a feature name could be "Initiation
Codon" and the attribute would be "ATG". Another example would be the feature name
"TAATAA Box" and the feature attribute would be "TAATAA". An example of such a database is
produced by the University of Wisconsin Genetics Computer Group (www.gcg.com).

Once the database of features is opened at the state 306, the process 300 moves to a state
308 wherein the first feature is read from the database. A comparison of the attribute of the first
feature with the first sequence is then made at a state 310. A determination is then made at a
decision state 316 whether the attribute of the feature was found in the first sequence. If the
attribute was found, then the process 300 moves to a state 318 wherein the name of the found
feature is displayed to the user.

The process 300 then moves to a decision state 320 wherein a determination is made
whether more features exist in the database. If no more features exist, then the process 300
terminates at an end state 324. However, if more features do exist in the database, then the process
300 reads the next sequence feature at a state 326 and loops back to the state 310 wherein the
attribute of the next feature is compared against the first sequence.

It should be noted that if the feature attribute is not found in the first sequence at the
decision state 316, the process 300 moves directly to the decision state 320 in order to determine if
any more features exist in the database.
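
The identifier process 300 can be sketched as a loop over a small feature table. In the Python example below, the two table entries are the examples given in the text; the in-memory list standing in for the feature database, the function name, and the choice to report the position of the first occurrence are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Feature name and attribute pairs; both entries are the examples given
# in the text. A real feature database would hold many more.
FEATURES: List[Tuple[str, str]] = [
    ("Initiation Codon", "ATG"),
    ("TAATAA Box", "TAATAA"),
]


def identify_features(
    sequence: str,
    features: Sequence[Tuple[str, str]] = FEATURES,
) -> List[Tuple[str, int]]:
    """Return (feature name, 0-based position of first occurrence) pairs."""
    sequence = sequence.upper()
    found = []
    for name, attribute in features:        # states 308 and 326
        pos = sequence.find(attribute)      # state 310
        if pos != -1:                       # decision state 316
            found.append((name, pos))       # state 318
    return found                            # end state 324


if __name__ == "__main__":
    for name, pos in identify_features("GGTAATAACCATGGC"):
        print(name, pos)
```

Exact substring search is the simplest possible attribute test; degenerate or position-weighted motifs would need a pattern-matching step in place of `str.find`.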

In another embodiment, the identifier may comprise a molecular modeling program which
determines the 3-dimensional structure of the polypeptide codes of the invention. Such programs
may use any methods known to those skilled in the art including methods based on homology
modeling, fold recognition and ab initio methods as described in Sternberg et al., 1999, which
disclosure is hereby incorporated by reference in its entirety. In some embodiments, the molecular
modeling program identifies target sequences that are most compatible with profiles representing
the structural environments of the residues in known three-dimensional protein structures. (See,
e.g., Eisenberg et al., U.S. Patent No. 5,436,850 issued July 25, 1995, which disclosure is hereby
incorporated by reference in its entirety). In another technique, the known three-dimensional
structures of proteins in a given family are superimposed to define the structurally conserved
regions in that family. This protein modeling technique also uses the known three-dimensional
structure of a homologous protein to approximate the structure of the polypeptide codes of the
invention. (See e.g., Srinivasan, et al., U.S. Patent No. 5,557,535 issued September 17, 1996, which
disclosure is hereby incorporated by reference in its entirety). Conventional homology modeling
techniques have been used routinely to build models of proteases and antibodies. (Sowdhamini et
al., (1997)). Comparative approaches can also be used to develop three-dimensional protein models
when the protein of interest has poor sequence identity to template proteins. In some cases, proteins
fold into similar three-dimensional structures despite having very weak sequence identities. For
example, a number of helical cytokines fold into a similar three-dimensional topology in spite of
weak sequence homology.
The recent development of threading methods now enables the identification of likely
folding patterns in a number of situations where the structural relatedness between target and
template(s) is not detectable at the sequence level. In hybrid methods, fold recognition is
performed using Multiple Sequence Threading (MST), structural equivalencies are deduced from
the threading output using a distance geometry program DRAGON to construct a low resolution
model, and a full-atom representation is constructed using a molecular modeling package such as
QUANTA.

According to this 3-step approach, candidate templates are first identified by using the
novel fold recognition algorithm MST, which is capable of performing simultaneous threading of
multiple aligned sequences onto one or more 3-D structures. In a second step, the structural
equivalencies obtained from the MST output are converted into interresidue distance restraints and
fed into the distance geometry program DRAGON, together with auxiliary information obtained
from secondary structure predictions. The program combines the restraints in an unbiased manner
and rapidly generates a large number of low resolution model conformations. In a third step, these
low resolution model conformations are converted into full-atom models and subjected to energy
minimization using the molecular modeling package QUANTA. (See e.g., Aszodi et al., (1997)).

The results of the molecular modeling analysis may then be used in rational drug design
techniques to identify agents which modulate the activity of the polypeptide codes of the invention.

Accordingly, another aspect of the present invention is a method of identifying a feature
within the nucleic acid codes of the invention or the polypeptide codes of the invention comprising
reading the nucleic acid code(s) or the polypeptide code(s) through the use of a computer program
which identifies features therein and identifying features within the nucleic acid code(s) or
polypeptide code(s) with the computer program. In one embodiment, the computer program comprises
a computer program which identifies open reading frames. In a further embodiment, the computer
program identifies linear or structural motifs in a polypeptide sequence. In another embodiment,
the computer program comprises a molecular modeling program. The method may be performed by
reading a single sequence or at least 2, 5, 10, 15, 20, 25, 30, 50, 75, 100, 150 or 200 of the nucleic
acid codes of the invention or the polypeptide codes of the invention through the use of the
computer program and identifying features within the nucleic acid codes or polypeptide codes with
the computer program.

The nucleic acid codes of the invention or the polypeptide codes of the invention may be
stored and manipulated in a variety of data processor programs in a variety of formats. For
example, they may be stored as text in a word processing file, such as Microsoft WORD or
WORDPERFECT or as an ASCII file in a variety of database programs familiar to those of skill in
the art, such as DB2, SYBASE, or ORACLE. In addition, many computer programs and databases
may be used as sequence comparers, identifiers, or sources of reference nucleotide or polypeptide
sequences to be compared to the nucleic acid codes of the invention or the polypeptide codes of the
invention. The following list is intended not to limit the invention but to provide guidance to
programs and databases which are useful with the nucleic acid codes of the invention or the
polypeptide codes of the invention. The programs and databases which may be used include, but
are not limited to: MacPattern (EMBL), DiscoveryBase (Molecular Applications Group), GeneMine
(Molecular Applications Group), Look (Molecular Applications Group), MacLook (Molecular
Applications Group), BLAST and BLAST2 (NCBI), BLASTN and BLASTX (Altschul et al., 1990),
FASTA (Pearson and Lipman, 1988), FASTDB (Brutlag et al., 1990), Catalyst (Molecular
Simulations Inc.), Catalyst/SHAPE (Molecular Simulations Inc.), Cerius2.DBAccess (Molecular
Simulations Inc.), HypoGen (Molecular Simulations Inc.), Insight II (Molecular Simulations Inc.),
Discover (Molecular Simulations Inc.), CHARMm (Molecular Simulations Inc.), Felix (Molecular
Simulations Inc.), DelPhi (Molecular Simulations Inc.), QuanteMM (Molecular Simulations Inc.),
Homology (Molecular Simulations Inc.), Modeler (Molecular Simulations Inc.), ISIS (Molecular
Simulations Inc.), Quanta/Protein Design (Molecular Simulations Inc.), WebLab (Molecular
Simulations Inc.), WebLab Diversity Explorer (Molecular Simulations Inc.), Gene Explorer
(Molecular Simulations Inc.), SeqFold (Molecular Simulations Inc.), the EMBL/Swissprotein
database, the MDL Available Chemicals Directory database, the MDL Drug Data Report database,
the Comprehensive Medicinal Chemistry database, Derwent's World Drug Index database, the
BioByteMasterFile database, the Genbank database, and the Genseqn database. Many other
programs and databases would be apparent to one of skill in the art given the present disclosure.

Motifs which may be detected using the above programs include sequences encoding
leucine zippers, helix-turn-helix motifs, glycosylation sites, ubiquitination sites, alpha helices, and
beta sheets, signal sequences encoding signal peptides which direct the secretion of the encoded
proteins, sequences implicated in transcription regulation such as homeoboxes, acidic stretches,
enzymatic active sites, substrate binding sites, and enzymatic cleavage sites.

Conclusion

As discussed above, the GENSET polynucleotides and polypeptides of the present
invention or fragments thereof can be used for various purposes. The polynucleotides can be used
to express recombinant protein for analysis, characterization or therapeutic use; as markers for
tissues in which the corresponding protein is preferentially expressed (either constitutively or at a
particular stage of tissue differentiation or development or in disease states); as molecular weight
markers on Southern gels; as chromosome markers or tags (when labeled) to identify chromosomes
or to map related gene positions; as a reagent (including a labeled reagent) in assays designed to
quantitatively determine levels of GENSET expression in biological samples; to compare with
endogenous DNA sequences in patients to identify potential genetic disorders; as probes to
hybridize and thus discover novel, related DNA sequences; as a source of information to derive
PCR primers for genetic fingerprinting; for selecting and making oligomers for attachment to a
"gene chip" or other support, including for examination for expression patterns; to raise anti-protein
antibodies using DNA immunization techniques; and as an antigen to raise anti-DNA antibodies or
elicit another immune response. Where the polynucleotide encodes a protein which binds or
potentially binds to another protein (such as, for example, in a receptor-ligand interaction), the
polynucleotide can also be used in interaction trap assays (such as, for example, that described in
Gyuris et al., (1993)) to identify polynucleotides encoding the other protein with which binding
occurs or to identify inhibitors of the binding interaction.

The proteins or polypeptides provided by the present invention can similarly be used in
assays to determine biological activity, including in a panel of multiple proteins for high-throughput
screening; to raise antibodies or to elicit another immune response; as a reagent (including the
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its receptor)
in biological fluids; as markers for tissues in which the corresponding protein is preferentially
expressed (either constitutively or at a particular stage of tissue differentiation or development or in
a disease state); and, of course, to isolate correlative receptors or ligands. Where the protein binds
or potentially binds to another protein (such as, for example, in a receptor-ligand interaction), the
protein can be used to identify the other protein with which binding occurs or to identify inhibitors
of the binding interaction. Proteins involved in these binding interactions can also be used to screen
for peptide or small molecule inhibitors or agonists of the binding interaction.

Any or all of these research utilities are capable of being developed into reagent grade or kit
format for commercialization as research products.

Methods for performing the uses listed above are well known to those skilled in the art.
References disclosing such methods include without limitation "Molecular Cloning: A Laboratory
Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E.F. Fritsch and T. Maniatis
eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning Techniques", Academic
Press, Berger and Kimmel eds., 1987, which disclosures are hereby incorporated by reference in
their entireties.

Polynucleotides and proteins of the present invention can also be used as nutritional sources
or supplements. Such uses include without limitation use as a protein or amino acid supplement,
use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In such cases
the protein or polynucleotide of the invention can be added to the feed of a particular organism or
can be administered as a separate solid or liquid preparation, such as in the form of powder, pills,
solutions, suspensions or capsules. In the case of microorganisms, the protein or polynucleotide of
the invention can be added to the medium in or on which the microorganism is cultured.

Although this invention has been described in terms of certain preferred embodiments, other
embodiments which will be apparent to those of ordinary skill in the art in view of the disclosure
herein are also within the scope of this invention. Accordingly, the scope of the invention is
intended to be defined only by reference to the appended claims.

Preparation of Antibody Compositions to GENSET proteins

Substantially pure protein or polypeptide is isolated from transfected or transformed cells
containing an expression vector encoding a GENSET protein or a portion thereof. The
concentration of protein in the final preparation is adjusted, for example, by concentration on an
Amicon filter device, to the level of a few micrograms/ml. Monoclonal or polyclonal antibody to the
protein can then be prepared as follows:

A. Monoclonal Antibody Production by Hybridoma Fusion

Monoclonal antibody to epitopes in the GENSET protein or a portion thereof can be
prepared from murine hybridomas according to the classical method of Kohler and Milstein, (1975)
or derivative methods thereof. Also see Harlow and Lane, (1988).

Briefly, a mouse is repetitively inoculated with a few micrograms of the GENSET protein 
or a portion thereof over a period of a few weeks. The mouse is then sacrificed, and the antibody 
producing cells of the spleen isolated. The spleen cells are fused by means of polyethylene glycol 
15 with mouse myeloma cells, and the excess unfused cells destroyed by growth of the system on 

selective media comprising aminopterin (HAT media). The successfully fused cells are diluted and 
aliquots of the dilution placed in wells of a microtiter plate where growth of the culture is 
continued. Antibody-producing clones are identified by detection of antibody in the supernatant 
fluid of the wells by immunoassay procedures, such as ELISA, as originally described by Engvall, 
20 (1 980), which disclosure is hereby incorporated by reference in its entirety, and derivative methods 
thereof. Selected positive clones can be expanded and their monoclonal antibody product harvested 
for use. Detailed procedures for monoclonal antibody production are described in Davis, et al. 
(1986) Section 21-2. 
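
The identification of antibody-producing clones from ELISA readings of the well supernatants can be illustrated with a short sketch. The cutoff rule used here (mean of the negative-control wells plus three standard deviations), the well names and the optical density values are illustrative assumptions and are not taken from the cited references.

```python
from statistics import mean, stdev


def positive_wells(od_by_well, negative_controls, n_sd=3.0):
    """Return the wells whose optical density exceeds the negative-control cutoff.

    The cutoff (mean + n_sd * standard deviation of the negative controls)
    is an assumed convention, not a rule stated in the specification.
    """
    cutoff = mean(negative_controls) + n_sd * stdev(negative_controls)
    return sorted(well for well, od in od_by_well.items() if od > cutoff)


# Hypothetical supernatant readings (absorbance units) keyed by microtiter well.
readings = {"A1": 0.08, "A2": 1.25, "A3": 0.11, "B1": 0.92, "B2": 0.07}
# Hypothetical readings from wells containing medium only.
blanks = [0.06, 0.08, 0.07, 0.09]

print(positive_wells(readings, blanks))
```

Wells returned by such a function would correspond to the positive clones selected for expansion; a stricter or looser cutoff can be chosen through `n_sd`.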

B. Polyclonal Antibody Production by Immunization 

Polyclonal antiserum containing antibodies to heterogeneous epitopes in the GENSET
protein or a portion thereof can be prepared by immunizing a suitable non-human animal with the
GENSET protein or a portion thereof, which can be unmodified or modified to enhance
immunogenicity. The non-human animal is preferably a non-human mammal,
usually a mouse, rat, rabbit, goat, or horse. Alternatively, a crude preparation which has been
enriched for GENSET concentration can be used to generate antibodies. Such proteins, fragments
or preparations are introduced into the non-human mammal in the presence of an appropriate
adjuvant (e.g. aluminum hydroxide, RIBI, etc.) which is known in the art. In addition, the protein,
fragment or preparation can be pretreated with an agent which will increase antigenicity; such
agents are known in the art and include, for example, methylated bovine serum albumin (mBSA),
bovine serum albumin (BSA), Hepatitis B surface antigen, and keyhole limpet hemocyanin (KLH).
Serum from the immunized animal is collected, treated and tested according to known procedures.
If the serum contains polyclonal antibodies to undesired epitopes, the polyclonal antibodies can be
purified by immunoaffinity chromatography.

Effective polyclonal antibody production is affected by many factors related both to the 
antigen and the host species. Also, host animals vary in response to site of inoculations and dose,
with both inadequate and excessive doses of antigen resulting in low titer antisera. Small doses (ng
level) of antigen administered at multiple intradermal sites appear to be most reliable. Techniques
for producing and processing polyclonal antisera are known in the art. An effective immunization
protocol for rabbits can be found in Vaitukaitis et al. (1971), which disclosure is hereby
incorporated by reference in its entirety.

Booster injections can be given at regular intervals, and antiserum harvested when antibody 
titer thereof, as determined semi-quantitatively, for example, by double immunodiffusion in agar 
against known concentrations of the antigen, begins to fall. See, for example, Ouchterlony et al.
(1973), which disclosure is hereby incorporated by reference in its entirety. Plateau concentration
of antibody is usually in the range of 0.1 to 0.2 mg/ml of serum (about 12 µM). Affinity of the
antisera for the antigen is determined by preparing competitive binding curves, as described, for 
example, by Fisher (1980), which disclosure is hereby incorporated by reference in its entirety. 
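
The rule of harvesting antiserum when the antibody titer begins to fall can be sketched as a simple scan over successive test bleeds. The function name and the titer values below are hypothetical; the sketch assumes that one semi-quantitative titer estimate is recorded after each booster injection.

```python
def first_falling_bleed(titers):
    """Return the index of the first bleed whose titer is lower than the
    preceding bleed, or None if the titer has not yet begun to fall."""
    for i in range(1, len(titers)):
        if titers[i] < titers[i - 1]:
            return i
    return None


# Hypothetical titers (mg/ml of serum), one estimate per booster interval.
titers = [0.02, 0.08, 0.15, 0.18, 0.18, 0.14]

print(first_falling_bleed(titers))
```

In this illustration the titer plateaus at 0.18 mg/ml and the decline is first seen at the last bleed, which would be the signal to harvest. Real titer estimates are noisy, so a practical rule might require a decline larger than the assay variability.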

Antibody preparations prepared according to either the monoclonal or the polyclonal 
protocol are useful in quantitative immunoassays which determine concentrations of antigen-bearing
substances in biological samples; they are also used semi-quantitatively or qualitatively to
identify the presence of antigen in a biological sample. The antibodies may also be used in 
therapeutic compositions for killing cells expressing the protein or reducing the levels of the protein 
in the body. 

Biological assays 

Assaying GENSET Secreted Proteins to Determine Whether they Bind to the Cell Surface

The secreted proteins encoded by the GENSET cDNAs, preferably the proteins of SEQ ID 
NOs: 242-272 and 274-384, or fragments thereof are cloned into expression vectors. The proteins 
are purified by size, charge, immunochromatography or other techniques familiar to those skilled in 
the art. Following purification, the proteins are labeled using techniques known to those skilled in
the art. The labeled proteins are incubated with cells or cell lines derived from a variety of organs
or tissues to allow the proteins to bind to any receptor present on the cell surface. Following the
incubation, the cells are washed to remove non-specifically bound protein. The labeled proteins are
detected by autoradiography. Alternatively, unlabeled proteins may be incubated with the cells and 
detected with antibodies having a detectable label, such as a fluorescent molecule, attached thereto. 

Specificity of cell surface binding may be analyzed by conducting a competition analysis in
which various amounts of unlabeled protein are incubated along with the labeled protein. The
amount of labeled protein bound to the cell surface decreases as the amount of competitive
unlabeled protein increases. As a control, various amounts of an unlabeled protein unrelated to the
labeled protein are included in some binding reactions. The amount of labeled protein bound to the
cell surface does not decrease in binding reactions containing increasing amounts of unrelated
unlabeled protein, indicating that the protein encoded by the cDNA binds specifically to the cell
surface.
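
The logic of the competition analysis can be illustrated with a brief sketch that expresses bound label as a percentage of the signal obtained without competitor and compares the homologous competitor with the unrelated control. The function names, the thresholds (`min_drop`, `max_control_drop`) and the signal values are assumptions chosen for illustration only.

```python
def percent_of_max(curve):
    """Convert (competitor_amount, bound_signal) pairs into percent of the
    signal measured with no competitor. The curve must include amount 0."""
    baseline = dict(curve)[0]
    return {amount: 100.0 * signal / baseline for amount, signal in curve}


def binding_is_specific(homologous, unrelated, min_drop=50.0, max_control_drop=20.0):
    """Return True when the homologous unlabeled protein displaces the label
    but the unrelated unlabeled protein does not (thresholds are assumed)."""
    hom = percent_of_max(homologous)
    ctl = percent_of_max(unrelated)
    hom_drop = 100.0 - hom[max(hom)]   # loss of signal at the highest competitor amount
    ctl_drop = 100.0 - ctl[max(ctl)]
    return hom_drop >= min_drop and ctl_drop <= max_control_drop


# Hypothetical bound signal (arbitrary units) at 0-, 10- and 100-fold excess competitor.
homologous = [(0, 1000), (10, 620), (100, 150)]
unrelated = [(0, 1000), (10, 980), (100, 950)]

print(percent_of_max(homologous))
print(binding_is_specific(homologous, unrelated))
```

With these example values the homologous competitor removes most of the bound label while the unrelated protein leaves it nearly unchanged, which is the pattern the text describes as specific binding.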

As discussed herein, secreted proteins have been shown to have a number of important 
physiological effects and, consequently, represent a valuable therapeutic resource. The secreted 
proteins encoded by the cDNAs or fragments thereof made using any of the methods described 
therein may be evaluated to determine their physiological activities as described below.

Assaying GENSET proteins or Fragments Thereof for Cytokine, Cell Proliferation or Cell 
Differentiation Activity 

Secreted proteins may act as cytokines or may affect cellular proliferation or differentiation. 
Many protein factors discovered to date, including all known cytokines, have exhibited activity in
one or more factor dependent cell proliferation assays, and hence the assays serve as a convenient
confirmation of cytokine activity. The activity of a protein of the present invention is evidenced by
any one of a number of routine factor dependent cell proliferation assays for cell lines including,
without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3, MC9/G, M+ (preB M+), 2E8, RB5,
DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e and CMK. The proteins encoded by the cDNAs of
the invention or fragments thereof may be evaluated for their ability to regulate T cell or thymocyte
proliferation in assays such as those described above or in the following references, which are
incorporated herein by reference: Current Protocols in Immunology, Ed. by J.E. Coligan et al.,
Greene Publishing Associates and Wiley-Interscience; Takai et al., J. Immunol. 137:3494-3500,
1986; Bertagnolli et al., J. Immunol. 145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology
133:327-341, 1991; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J.
Immunol. 152:1756-1761, 1994.
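
A result from a factor dependent cell proliferation assay is commonly summarized as a stimulation index, i.e. the ratio of the proliferation signal in the presence of the test protein to the signal in its absence. The sketch below assumes replicate readouts (for example, incorporated label counts) for each condition; the function name and the numbers are hypothetical and are not drawn from the cited references.

```python
from statistics import mean


def stimulation_index(treated_counts, untreated_counts):
    """Ratio of mean proliferation signal with the test protein to the mean
    signal without it. Values above 1 indicate enhanced proliferation."""
    return mean(treated_counts) / mean(untreated_counts)


# Hypothetical replicate readouts (counts per minute) for a factor dependent cell line.
with_protein = [5200, 4800, 5000]
without_protein = [1000, 950, 1050]

print(stimulation_index(with_protein, without_protein))
```

The index alone does not establish significance; replicate variability and a positive control with a known growth factor would normally be considered alongside it.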

In addition, numerous assays for cytokine production and/or the proliferation of spleen 
cells, lymph node cells and thymocytes are known. These include the techniques disclosed in 
Current Protocols in Immunology, J.E. Coligan et al. Eds., Vol 1 pp. 3.12.1-3.12.14, John Wiley and
Sons, Toronto. 1994; and Schreiber, R.D. Current Protocols in Immunology, supra Vol 1 pp.
6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994.

The proteins encoded by the cDNAs of the invention may also be assayed for the ability to 
regulate the proliferation and differentiation of hematopoietic or lymphopoietic cells. Many assays 
for such activity are familiar to those skilled in the art, including the assays in the following
references, which are incorporated herein by reference: Bottomly, K., Davis, L.S. and Lipsky, P.E.,
Measurement of Human and Murine Interleukin 2 and Interleukin 4, Current Protocols in
Immunology, J.E. Coligan et al. Eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991;
de Vries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 36:690-692, 1988;
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Nordan, R., Measurement of
Mouse and Human Interleukin 6, Current Protocols in Immunology, J.E. Coligan et al. Eds. Vol 1
pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci. U.S.A.
83:1857-1861, 1986; Bennett, F., Giannotti, J., Clark, S.C. and Turner, K.J., Measurement of
Human Interleukin 11, Current Protocols in Immunology, J.E. Coligan et al. Eds. Vol 1 pp. 6.15.1,
John Wiley and Sons, Toronto. 1991; Ciarletta, A., Giannotti, J., Clark, S.C. and Turner, K.J.,
Measurement of Mouse and Human Interleukin 9, Current Protocols in Immunology, J.E. Coligan et
al. Eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991.

The proteins encoded by the cDNAs of the invention may also be assayed for their ability to 
regulate T-cell responses to antigens. Many assays for such activity are familiar to those skilled in 
the art, including the assays described in the following references, which are incorporated herein by 
reference: Chapter 3 (In vitro Assays for Mouse Lymphocyte Function), Chapter 6 (Cytokines and
Their Cellular Receptors) and Chapter 7 (Immunologic Studies in Humans) in Current Protocols in
Immunology, J.E. Coligan et al. Eds., Greene Publishing Associates and Wiley-Interscience;
Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095, 1980; Weinberger et al., Eur. J.
Immun. 11:405-411, 1981; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol.
140:508-512, 1988.

Those proteins which exhibit cytokine, cell proliferation, or cell differentiation activity may
then be formulated as pharmaceuticals and used to treat clinical conditions in which induction of
cell proliferation or differentiation is beneficial. Alternatively, as described in more detail below,
genes encoding these proteins or nucleic acids regulating the expression of these proteins may be
introduced into appropriate host cells to increase or decrease the expression of the proteins as
desired.

Assaying GENSET proteins or Fragments Thereof for Activity as Immune System Regulators 

The proteins encoded by the cDNAs of the invention may also be evaluated for their effects 
as immune regulators. For example, the proteins may be evaluated for their activity to influence 
thymocyte or splenocyte cytotoxicity. Numerous assays for such activity are familiar to those
skilled in the art, including the assays described in the following references, which are incorporated
herein by reference: Chapter 3 (In vitro Assays for Mouse Lymphocyte Function 3.1-3.19) and
Chapter 7 (Immunologic Studies in Humans) in Current Protocols in Immunology, J.E. Coligan et
al. Eds., Greene Publishing Associates and Wiley-Interscience; Herrmann et al., Proc. Natl. Acad.
Sci. USA 78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J.
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al.,
Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994.

The proteins encoded by the cDNAs of the invention may also be evaluated for their effects
on T-cell dependent immunoglobulin responses and isotype switching. Numerous assays for such
activity are familiar to those skilled in the art, including the assays disclosed in the following
references, which are incorporated herein by reference: Maliszewski, J. Immunol. 144:3028-3033,
1990; Mond, J.J. and Brunswick, M., Assays for B Cell Function: In vitro Antibody Production, Vol
1 pp. 3.8.1-3.8.16 in Current Protocols in Immunology, J.E. Coligan et al. Eds., John Wiley and
Sons, Toronto. 1994.

The proteins encoded by the cDNAs of the invention may also be evaluated for their effect 
on immune effector cells, including their effect on Th1 cells and cytotoxic lymphocytes. Numerous
assays for such activity are familiar to those skilled in the art, including the assays disclosed in the
following references, which are incorporated herein by reference: Chapter 3 (In vitro Assays for
Mouse Lymphocyte Function 3.1-3.19) and Chapter 7 (Immunologic Studies in Humans) in Current
Protocols in Immunology, J.E. Coligan et al. Eds., Greene Publishing Associates and
Wiley-Interscience; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512,
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992.

The proteins encoded by the cDNAs of the invention may also be evaluated for their effect
on dendritic cell mediated activation of naive T-cells. Numerous assays for such activity are
familiar to those skilled in the art, including the assays disclosed in the following references, which
are incorporated herein by reference: Guery et al., J. Immunol. 134:536-544, 1995; Inaba et al.,
Journal of Experimental Medicine 173:549-559, 1991; Macatonia et al., Journal of Immunology
154:5071-5079, 1995; Porgador et al., Journal of Experimental Medicine 182:255-260, 1995; Nair
et al., Journal of Virology 67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994;
Macatonia et al., Journal of Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal
of Clinical Investigation 94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine
172:631-640, 1990.

The proteins encoded by the cDNAs of the invention may also be evaluated for their
influence on the lifetime of lymphocytes. Numerous assays for such activity are familiar to those
skilled in the art, including the assays disclosed in the following references, which are incorporated
herein by reference: Darzynkiewicz et al., Cytometry 13:795-808, 1992; Gorczyca et al., Leukemia
7:659-670, 1993; Gorczyca et al., Cancer Research 53:1945-1951, 1993; Itoh et al., Cell
66:233-243, 1991; Zacharchuk, Journal of Immunology 145:4037-4045, 1990; Zamai et al., Cytometry
14:891-897, 1993; Gorczyca et al., International Journal of Oncology 1:639-648, 1992.

Assays for proteins that influence early steps of T-cell commitment and development
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al.,
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al., Proc.
Nat. Acad. Sci. USA 88:7548-7551, 1991.

Those proteins which exhibit activity as immune system regulators may then be
formulated as pharmaceuticals and used to treat clinical conditions in which regulation of immune
activity is beneficial. For example, the protein may be useful in the treatment of various immune
deficiencies and disorders (including severe combined immunodeficiency (SCID)), e.g., in
regulating (up or down) growth and proliferation of T and/or B lymphocytes, as well as effecting
the cytolytic activity of NK cells and other cell populations. These immune deficiencies may be
genetic or be caused by viral (e.g., HIV) as well as bacterial or fungal infections, or may result from
autoimmune disorders. More specifically, infectious diseases caused by viral, bacterial, fungal or
other infection may be treatable using a protein of the present invention, including infections by
HIV, hepatitis viruses, herpesviruses, mycobacteria, Leishmania spp., malaria spp. and various
fungal infections such as candidiasis. Of course, in this regard, a protein of the present invention
may also be useful where a boost to the immune system generally may be desirable, i.e., in the
treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present invention 
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus,
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome, autoimmune
thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host disease and
autoimmune inflammatory eye disease. Such a protein of the present invention may also be
useful in the treatment of allergic reactions and conditions, such as asthma (particularly allergic
asthma) or other respiratory problems. Other conditions in which immune suppression is desired
(including, for example, organ transplantation) may also be treatable using a protein of the present
invention.

Using the proteins of the invention it may also be possible to regulate immune responses in
a number of ways. Down regulation may be in the form of inhibiting or blocking an immune
response already in progress or may involve preventing the induction of an immune response. The
functions of activated T-cells may be inhibited by suppressing T cell responses or by inducing
specific tolerance in T cells, or both. Immunosuppression of T cell responses is generally an active,
non-antigen-specific process which requires continuous exposure of the T cells to the suppressive
agent. Tolerance, which involves inducing non-responsiveness or anergy in T cells, is
distinguishable from immunosuppression in that it is generally antigen-specific and persists after
exposure to the tolerizing agent has ceased. Operationally, tolerance can be demonstrated by the
lack of a T cell response upon reexposure to specific antigen in the absence of the tolerizing agent.

Down regulating or preventing one or more antigen functions (including without limitation
B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high level lymphokine
synthesis by activated T cells, will be useful in situations of tissue, skin and organ transplantation
and in graft-versus-host disease (GVHD). For example, blockage of T cell function should result in
reduced tissue destruction in tissue transplantation. Typically, in tissue transplants, rejection of the
transplant is initiated through its recognition as foreign by T cells, followed by an immune reaction
that destroys the transplant. The administration of a molecule which inhibits or blocks interaction
of a B7 lymphocyte antigen with its natural ligand(s) on immune cells (such as a soluble,
monomeric form of a peptide having B7-2 activity alone or in conjunction with a monomeric form
of a peptide having an activity of another B lymphocyte antigen (e.g., B7-1, B7-3) or blocking
antibody), prior to transplantation can lead to the binding of the molecule to the natural ligand(s) on
the immune cells without transmitting the corresponding costimulatory signal. Blocking B
lymphocyte antigen function in this manner prevents cytokine synthesis by immune cells, such as T
cells, and thus acts as an immunosuppressant. Moreover, the lack of costimulation may also be
sufficient to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term
tolerance by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated
administration of these blocking reagents. To achieve sufficient immunosuppression or tolerance in
a subject, it may also be necessary to block the function of a combination of B lymphocyte antigens.

The efficacy of particular blocking reagents in preventing organ transplant rejection or
GVHD can be assessed using animal models that are predictive of efficacy in humans. Examples of
appropriate systems which can be used include allogeneic cardiac grafts in rats and xenogeneic
pancreatic islet cell grafts in mice, both of which have been used to examine the
immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et al.,
Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci USA, 89:11102-11105 (1992).
In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven Press,
New York, 1989, pp. 846-847) can be used to determine the effect of blocking B lymphocyte
antigen function in vivo on the development of that disease.

Blocking antigen function may also be therapeutically useful for treating autoimmune 
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are
reactive against self tissue and which promote the production of cytokines and autoantibodies
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may
reduce or eliminate disease symptoms. Administration of reagents which block costimulation of T
cells by disrupting receptor-ligand interactions of B lymphocyte antigens can be used to inhibit T
cell activation and prevent production of autoantibodies or T cell-derived cytokines which may be
involved in the disease process. Additionally, blocking reagents may induce antigen-specific
tolerance of autoreactive T cells which could lead to long-term relief from the disease. The efficacy
of blocking reagents in preventing or alleviating autoimmune disorders can be determined using a
number of well-characterized animal models of human autoimmune diseases. Examples include
murine experimental autoimmune encephalitis, systemic lupus erythematosus in MRL/lpr/lpr mice or
NZB hybrid mice, murine autoimmune collagen arthritis, diabetes mellitus in NOD mice and BB rats,
and murine experimental myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press,
New York, 1989, pp. 840-856).

Upregulation of an antigen function (preferably a B lymphocyte antigen function), as a
means of upregulating immune responses, may also be useful in therapy. Upregulation of immune
responses may be in the form of enhancing an existing immune response or eliciting an initial
immune response. For example, enhancing an immune response through stimulating B lymphocyte
antigen function may be useful in cases of viral infection. In addition, systemic viral diseases such
as influenza, the common cold, and encephalitis might be alleviated by the administration of
a stimulatory form of B lymphocyte antigens systemically.

Alternatively, anti-viral immune responses may be enhanced in an infected patient by 
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed APCs 
either expressing a peptide of the present invention or together with a stimulatory form of a soluble
peptide of the present invention and reintroducing the in vitro activated T cells into the patient. The 
infected cells would now be capable of delivering a costimulatory signal to T cells in vivo, thereby 
activating the T cells. 

In another application, upregulation or enhancement of antigen function (preferably B
lymphocyte antigen function) may be useful in the induction of tumor immunity. Tumor cells (e.g.,
sarcoma, melanoma, lymphoma, leukemia, neuroblastoma, carcinoma) transfected with a nucleic
acid encoding at least one peptide of the present invention can be administered to a subject to
overcome tumor-specific tolerance in the subject. If desired, the tumor cell can be transfected to
express a combination of peptides. For example, tumor cells obtained from a patient can be
transfected ex vivo with an expression vector directing the expression of a peptide having B7-2-like
activity alone, or in conjunction with a peptide having B7-1-like activity and/or B7-3-like activity.
The transfected tumor cells are returned to the patient to result in expression of the peptides on the
surface of the transfected cell. Alternatively, gene therapy techniques can be used to target a tumor
cell for transfection in vivo.

The presence of the peptide of the present invention having the activity of a B lymphocyte
antigen(s) on the surface of the tumor cell provides the necessary costimulation signal to T cells to
induce a T cell mediated immune response against the transfected tumor cells. In addition, tumor
cells which lack MHC class I or MHC class II molecules, or which fail to reexpress sufficient
amounts of MHC class I or MHC class II molecules, can be transfected with nucleic acids encoding
all or a fragment (e.g., a cytoplasmic-domain truncated fragment) of an MHC class I α chain
protein and β2 microglobulin protein or an MHC class II α chain protein and an MHC class II β
chain protein to thereby express MHC class I or MHC class II proteins on the cell surface.
Expression of the appropriate class I or class II MHC in conjunction with a peptide having the
activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T cell mediated immune
response against the transfected tumor cell. Optionally, a gene encoding an antisense construct
which blocks expression of an MHC class II associated protein, such as the invariant chain, can also
be cotransfected with a DNA encoding a peptide having the activity of a B lymphocyte antigen to
promote presentation of tumor associated antigens and induce tumor specific immunity. Thus, the
induction of a T cell mediated immune response in a human subject may be sufficient to overcome
tumor-specific tolerance in the subject. Alternatively, as described in more detail below, genes
encoding these proteins or nucleic acids regulating the expression of these proteins may be
introduced into appropriate host cells to increase or decrease the expression of the proteins as
desired.

Assaying GENSET proteins or Fragments Thereof for Hematopoiesis Regulating Activity 

The proteins encoded by the cDNAs of the invention or fragments thereof may also be 
evaluated for their hematopoiesis regulating activity. For example, the effect of the proteins on 
embryonic stem cell differentiation may be evaluated. Numerous assays for such activity are
familiar to those skilled in the art, including the assays disclosed in the following references, which
are incorporated herein by reference: Johansson et al., Cellular Biology 15:141-151, 1995; Keller et
al., Molecular and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915,
1993.

The proteins encoded by the cDNAs of the invention or fragments thereof may also be
evaluated for their influence on the lifetime of stem cells and stem cell differentiation. Numerous
assays for such activity are familiar to those skilled in the art, including the assays disclosed in the
following references, which are incorporated herein by reference: Freshney, M.G. Methylcellulose
Colony Forming Assays, in Culture of Hematopoietic Cells, R.I. Freshney, et al. Eds. pp. 265-268,
Wiley-Liss, Inc., New York, NY. 1994; Hirayama et al., Proc. Natl. Acad. Sci. USA 89:5907-5911,
1992; McNiece, I.K. and Briddell, R.A. Primitive Hematopoietic Colony Forming Cells with High
Proliferative Potential, in Culture of Hematopoietic Cells, R.I. Freshney, et al. Eds. Vol pp. 23-39,
Wiley-Liss, Inc., New York, NY. 1994; Neben et al., Experimental Hematology 22:353-359, 1994;
Ploemacher, R.E. Cobblestone Area Forming Cell Assay, in Culture of Hematopoietic Cells, R.I.
Freshney, et al. Eds. pp. 1-21, Wiley-Liss, Inc., New York, NY. 1994; Spooncer, E., Dexter, M. and
Allen, T. Long Term Bone Marrow Cultures in the Presence of Stromal Cells, in Culture of
Hematopoietic Cells, R.I. Freshney, et al. Eds. pp. 163-179, Wiley-Liss, Inc., New York, NY. 1994;
and Sutherland, H.J. Long Term Culture Initiating Cell Assay, in Culture of Hematopoietic Cells,
R.I. Freshney, et al. Eds. pp. 139-162, Wiley-Liss, Inc., New York, NY. 1994.

Those proteins which exhibit hematopoiesis regulatory activity may then be formulated as
pharmaceuticals and used to treat clinical conditions in which regulation of hematopoiesis is
beneficial. For example, a protein of the present invention may be useful in regulation of
hematopoiesis and, consequently, in the treatment of myeloid or lymphoid cell deficiencies. Even
marginal biological activity in support of colony forming cells or of factor-dependent cell lines
indicates involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating utility,
for example, in treating various anemias or for use in conjunction with irradiation/chemotherapy to
stimulate the production of erythroid precursors and/or erythroid cells; in supporting the growth and
proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e., traditional
CSF activity) useful, for example, in conjunction with chemotherapy to prevent or treat consequent
myelo-suppression; in supporting the growth and proliferation of megakaryocytes and consequently
of platelets thereby allowing prevention or treatment of various platelet disorders such as
thrombocytopenia, and generally for use in place of or complementary to platelet transfusions;
and/or in supporting the growth and proliferation of hematopoietic stem cells which are capable of
maturing to any and all of the above-mentioned hematopoietic cells and therefore find therapeutic
utility in various stem cell disorders (such as those usually treated with transplantation, including,
without limitation, aplastic anemia and paroxysmal nocturnal hemoglobinuria), as well as in
repopulating the stem cell compartment post irradiation/chemotherapy, either in-vivo or ex-vivo
(i.e., in conjunction with bone marrow transplantation or with peripheral progenitor cell
transplantation (homologous or heterologous)) as normal cells or genetically manipulated for gene
therapy. Alternatively, as described in more detail below, genes encoding these proteins or nucleic
acids regulating the expression of these proteins may be introduced into appropriate host cells to
increase or decrease the expression of the proteins as desired.

Assaying GENSET proteins or Fragments Thereof for Regulation of Tissue Growth 

The proteins encoded by the cDNAs of the invention or fragments thereof may also be 
evaluated for their effect on tissue growth. Numerous assays for such activity are familiar to those
skilled in the art, including the assays disclosed in International Patent Publication No. 
WO95/16035, International Patent Publication No. WO95/05846 and International Patent 
Publication No. WO91/07491, which are incorporated herein by reference. 

Assays for wound healing activity include, without limitation, those described in: Winter, 
Epidermal Wound Healing, pp. 71-112 (Maibach, HI and Rovee, DT, eds.), Year Book Medical
Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol. 71:382-84 (1978),
which are incorporated herein by reference.

Those proteins which are involved in the regulation of tissue growth may then be 
formulated as pharmaceuticals and used to treat clinical conditions in which regulation of tissue 
growth is beneficial. For example, a protein of the present invention also may have utility in
compositions used for bone, cartilage, tendon, ligament and/or nerve tissue growth or regeneration,
as well as for wound healing and tissue repair and replacement, and in the treatment of burns,
incisions and ulcers. 

A protein of the present invention, which induces cartilage and/or bone growth in 
circumstances where bone is not normally formed, has application in the healing of bone fractures 
and cartilage damage or defects in humans and other animals. Such a preparation employing a
protein of the invention may have prophylactic use in closed as well as open fracture reduction and 
also in the improved fixation of artificial joints. De novo bone formation induced by an osteogenic 
agent contributes to the repair of congenital, trauma induced, or oncologic resection induced 
craniofacial defects, and also is useful in cosmetic plastic surgery. 

A protein of this invention may also be used in the treatment of periodontal disease, and in
other tooth repair processes. Such agents may provide an environment to attract bone-forming
cells, stimulate growth of bone-forming cells or induce differentiation of progenitors of
bone-forming cells. A protein of the invention may also be useful in the treatment of osteoporosis or
osteoarthritis, such as through stimulation of bone and/or cartilage repair or by blocking
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.)
mediated by inflammatory processes.

Another category of tissue regeneration activity that may be attributable to the protein of the 
present invention is tendon/ligament formation. A protein of the present invention, which induces 
tendon/ligament-like tissue or other tissue formation in circumstances where such tissue is not
normally formed, has application in the healing of tendon or ligament tears, deformities and other
tendon or ligament defects in humans and other animals. Such a preparation employing a
tendon/ligament-like tissue inducing protein may have prophylactic use in preventing damage to
tendon or ligament tissue, as well as use in the improved fixation of tendon or ligament to bone or
other tissues, and in repairing defects to tendon or ligament tissue. De novo tendon/ligament-like
tissue formation induced by a composition of the present invention contributes to the repair of
congenital, trauma induced, or other tendon or ligament defects of other origin, and is also useful in
cosmetic plastic surgery for attachment or repair of tendons or ligaments. The compositions of the
present invention may provide an environment to attract tendon- or ligament-forming cells,
stimulate growth of tendon- or ligament-forming cells, induce differentiation of progenitors of
tendon- or ligament-forming cells, or induce growth of tendon/ligament cells or progenitors ex vivo
for return in vivo to effect tissue repair. The compositions of the invention may also be useful in the 
treatment of tendinitis, carpal tunnel syndrome and other tendon or ligament defects. The 
compositions may also include an appropriate matrix and/or sequestering agent as a carrier as is 
well known in the art. 

The protein of the present invention may also be useful for proliferation of neural cells and
for regeneration of nerve and brain tissue, i.e., for the treatment of central and peripheral nervous
system diseases and neuropathies, as well as mechanical and traumatic disorders, which involve
degeneration, death or trauma to neural cells or nerve tissue. More specifically, a protein may be
used in the treatment of diseases of the peripheral nervous system, such as peripheral nerve injuries,
peripheral neuropathy and localized neuropathies, and central nervous system diseases, such as
Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic lateral sclerosis, and
Shy-Drager syndrome. Further conditions which may be treated in accordance with the present
invention include mechanical and traumatic disorders, such as spinal cord disorders, head trauma
and cerebrovascular diseases such as stroke. Peripheral neuropathies resulting from chemotherapy 
or other medical therapies may also be treatable using a protein of the invention. 

Proteins of the invention may also be useful to promote better or faster closure of
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular
insufficiency, surgical and traumatic wounds, and the like. 

It is expected that a protein of the present invention may also exhibit activity for generation 
or regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine,
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular
endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the
desired effects may be by inhibition or modulation of fibrotic scarring to allow normal tissue to 
generate. A protein of the invention may also exhibit angiogenic activity. 

A protein of the present invention may also be useful for gut protection or regeneration and 
treatment of lung or liver fibrosis, reperfusion injury in various tissues, and conditions resulting 
from systemic cytokine damage.

A protein of the present invention may also be useful for promoting or inhibiting 
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the 
growth of tissues described above. 

Alternatively, as described in more detail below, genes encoding these proteins or nucleic 
acids regulating the expression of these proteins may be introduced into appropriate host cells to
increase or decrease the expression of the proteins as desired. 

Assaying GENSET proteins or Fragments Thereof for Regulation of Reproductive Hormones or 
Cell Movement 

The proteins encoded by the cDNAs of the invention or fragments thereof may also be
evaluated for their ability to regulate reproductive hormones, such as follicle stimulating hormone.
Numerous assays for such activity are familiar to those skilled in the art, including the assays
disclosed in the following references, which are incorporated herein by reference: Vale et al.,
Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci.
USA 83:3091-3095, 1986; Chapter 6.12 (Measurement of Alpha and Beta Chemokines) in Current
Protocols in Immunology, J.E. Coligan et al. Eds. Greene Publishing Associates and
Wiley-Interscience; Taub et al., J. Clin. Invest. 95:1370-1376, 1995; Lind et al., APMIS 103:140-146, 1995;
Muller et al., Eur. J. Immunol. 25:1744-1748; Gruber et al., J. of Immunol. 152:5860-5867, 1994;
Johnston et al., J. of Immunol. 153:1762-1768, 1994.

Those proteins which exhibit activity as reproductive hormones or regulators of cell 
movement may then be formulated as pharmaceuticals and used to treat clinical conditions in which
regulation of reproductive hormones or cell movement is beneficial. For example, a protein of the
present invention may also exhibit activin- or inhibin-related activities. Inhibins are characterized
by their ability to inhibit the release of follicle stimulating hormone (FSH), while activins are
characterized by their ability to stimulate the release of follicle stimulating hormone (FSH). Thus, a
protein of the present invention, alone or in heterodimers with a member of the inhibin α family,
may be useful as a contraceptive based on the ability of inhibins to decrease fertility in female
mammals and decrease spermatogenesis in male mammals. Administration of sufficient amounts of
other inhibins can induce infertility in these mammals. Alternatively, the protein of the invention,
as a homodimer or as a heterodimer with other protein subunits of the inhibin-β group, may be
useful as a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate
FSH release from cells of the anterior pituitary. See, for example, United States Patent 4,798,885, 
the disclosure of which is incorporated herein by reference. A protein of the invention may also be 
useful for advancement of the onset of fertility in sexually immature mammals, so as to increase the 
lifetime reproductive performance of domestic animals such as cows, sheep and pigs. 

Alternatively, as described in more detail below, genes encoding these proteins or nucleic
acids regulating the expression of these proteins may be introduced into appropriate host cells to
increase or decrease the expression of the proteins as desired. 

Assaying GENSET proteins or Fragments Thereof for Chemotactic/Chemokinetic Activity 

The proteins encoded by the cDNAs of the invention or fragments thereof may also be
evaluated for chemotactic/chemokinetic activity. For example, a protein of the present invention
may have chemotactic or chemokinetic activity (e.g., act as a chemokine) for mammalian cells,
including, for example, monocytes, fibroblasts, neutrophils, T-cells, mast cells, eosinophils,
epithelial and/or endothelial cells. Chemotactic and chemokinetic proteins can be used to mobilize
or attract a desired cell population to a desired site of action. Chemotactic or chemokinetic proteins
provide particular advantages in treatment of wounds and other trauma to tissues, as well as in
treatment of localized infections. For example, attraction of lymphocytes, monocytes or neutrophils
to tumors or sites of infection may result in improved immune responses against the tumor or 
infecting agent. 

A protein or peptide has chemotactic activity for a particular cell population if it can 
stimulate, directly or indirectly, the directed orientation or movement of such cell population.
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells.
Whether a particular protein has chemotactic activity for a population of cells can be readily
determined by employing such protein or peptide in any known assay for cell chemotaxis. 

The activity of a protein of the invention may, among other means, be measured by the 
following methods: 

Assays for chemotactic activity (which will identify proteins that induce or prevent
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of cells
across a membrane as well as the ability of a protein to induce the adhesion of one cell population
to another cell population. Suitable assays for movement and adhesion include, without limitation,
those described in: Current Protocols in Immunology, Ed. by J.E. Coligan, A.M. Kruisbeek, D.H.
Margulies, E.M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience
(Chapter 6.12, Measurement of alpha and beta Chemokines 6.12.1-6.12.28); Taub et al., J. Clin.
Invest. 95:1370-1376, 1995; Lind et al., APMIS 103:140-146, 1995; Mueller et al., Eur. J. Immunol.
25:1744-1748; Gruber et al., J. of Immunol. 152:5860-5867, 1994; Johnston et al., J. of Immunol.
153:1762-1768, 1994.

Assaying GENSET proteins or Fragments Thereof for Regulation of Blood Clotting

The proteins encoded by the cDNAs of the invention or fragments thereof may also be 
evaluated for their effects on blood clotting. Numerous assays for such activity are familiar to those 
skilled in the art, including the assays disclosed in the following references, which are incorporated 
herein by reference: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis
Res. 45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins
35:467-474, 1988. 

Those proteins which are involved in the regulation of blood clotting may then be 
formulated as pharmaceuticals and used to treat clinical conditions in which regulation of blood 
clotting is beneficial. For example, a protein of the invention may also exhibit hemostatic or
thrombolytic activity. As a result, such a protein is expected to be useful in treatment of various
coagulation disorders (including hereditary disorders, such as hemophilias) or to enhance
coagulation and other hemostatic events in treating wounds resulting from trauma, surgery or other
causes. A protein of the invention may also be useful for dissolving or inhibiting formation of
thromboses and for treatment and prevention of conditions resulting therefrom (such as, for
example, infarction of cardiac and central nervous system vessels (e.g., stroke)). Alternatively, as
described in more detail below, genes encoding these proteins or nucleic acids regulating the 
expression of these proteins may be introduced into appropriate host cells to increase or decrease 
the expression of the proteins as desired. 

Assaying GENSET proteins or Fragments Thereof for Involvement in Receptor/Ligand Interactions

The proteins encoded by the cDNAs or a fragment thereof may also be evaluated for their
involvement in receptor/ligand interactions. Numerous assays for such involvement are familiar to
those skilled in the art, including the assays disclosed in the following references, which are
incorporated herein by reference: Chapter 7.28 (Measurement of Cellular Adhesion under Static
Conditions 7.28.1-7.28.22) in Current Protocols in Immunology, J.E. Coligan et al. Eds. Greene
Publishing Associates and Wiley-Interscience; Takai et al., Proc. Natl. Acad. Sci. USA
84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988; Rosenstein et al., J. Exp. Med.
169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods 175:59-68, 1994; Stitt et al., Cell
80:661-670, 1995; Gyuris et al., Cell 75:791-803, 1993.

For example, the proteins of the present invention may also demonstrate activity as 
receptors, receptor ligands or inhibitors or agonists of receptor/ligand interactions. Examples of
such receptors and ligands include, without limitation, cytokine receptors and their ligands, receptor
kinases and their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell
interactions and their ligands (including without limitation, cellular adhesion molecules (such as
selectins, integrins and their ligands) and receptor/ligand pairs involved in antigen presentation,
antigen recognition and development of cellular and humoral immune responses). Receptors and
ligands are also useful for screening of potential peptide or small molecule inhibitors of the relevant
receptor/ligand interaction. A protein of the present invention (including, without limitation,
fragments of receptors and ligands) may itself be useful as an inhibitor of receptor/ligand
interactions. 

Assaying GENSET proteins or Fragments Thereof for Anti-Inflammatory Activity

The proteins encoded by the cDNAs or a fragment thereof may also be evaluated for
anti-inflammatory activity. The anti-inflammatory activity may be achieved by providing a stimulus to
cells involved in the inflammatory response, by inhibiting or promoting cell-cell interactions (such
as, for example, cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the
inflammatory process, by inhibiting or promoting cell extravasation, or by stimulating or suppressing
production of other factors which more directly inhibit or promote an inflammatory response.
Proteins exhibiting such activities can be used to treat inflammatory conditions (including chronic or
acute conditions), including without limitation inflammation associated with infection (such as
septic shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion
injury, endotoxin lethality, arthritis, complement-mediated hyperacute rejection,
nephritis, cytokine or chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease
or resulting from over production of cytokines such as TNF or IL-1. Proteins of the invention may
also be useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material.

Assaying GENSET proteins or Fragments Thereof for Tumor Inhibition Activity 

The proteins encoded by the cDNAs of the invention or a fragment thereof may also be 
evaluated for tumor inhibition activity. In addition to the activities described above for
immunological treatment or prevention of tumors, a protein of the invention may exhibit other
anti-tumor activities. A protein may inhibit tumor growth directly or indirectly (such as, for example,
via ADCC). A protein may exhibit its tumor inhibitory activity by acting on tumor tissue or tumor
precursor tissue, by inhibiting formation of tissues necessary to support tumor growth (such as, for
example, by inhibiting angiogenesis), by causing production of other factors, agents or cell types
which inhibit tumor growth, or by suppressing, eliminating or inhibiting factors, agents or cell types
which promote tumor growth.

A protein of the invention may also exhibit one or more of the following additional 
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents, 
including, without limitation, bacteria, viruses, fungi and other parasites; effecting (suppressing or
enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape (such 
as, for example, breast augmentation or diminution, change in bone form or shape); effecting 
biorhythms or circadian cycles or rhythms; effecting the fertility of male or female subjects; 
effecting the metabolism, catabolism, anabolism, processing, utilization, storage or elimination of
dietary fat, lipid, protein, carbohydrate, vitamins, minerals, cofactors or other nutritional factors or
component(s); effecting behavioral characteristics, including, without limitation, appetite, libido, 
stress, cognition (including cognitive disorders), depression (including depressive disorders) and 
violent behaviors; providing analgesic effects or other pain reducing effects; promoting 
differentiation and growth of embryonic stem cells in lineages other than hematopoietic lineages; 
hormonal or endocrine activity; in the case of enzymes, correcting deficiencies of the enzyme and
treating deficiency-related diseases; treatment of hyperproliferative disorders (such as, for example, 
psoriasis); immunoglobulin-like activity (such as, for example, the ability to bind antigens or 
complement); and the ability to act as an antigen in a vaccine composition to raise an immune 
response against such protein or another material or entity which is cross-reactive with such protein.

       188-28-4-0-B12-CS.cor    PRT    pBluescriptII SK-
312    188-28-4-0-B12-CS.fr    PRT    pBluescriptII SK-
313    188-28-4-0-D4-CS    PRT    pBluescriptII SK-
314    188-41-1-0-B8-CS.cor    PRT    pBluescriptII SK-
315    188-41-1-0-B8-CS.fr    PRT    pBluescriptII SK-
316    188-45-1-0-D9-CS    PRT    pBluescriptII SK-
317    188-9-2-0-E1-CS    PRT    pBluescriptII SK-
318    105-079-3-0-A11-CS    PRT    pBluescriptII SK-
319    105-092-1-0-H7-CS    PRT    pBluescriptII SK-
320    105-141-4-0-H9-CS    PRT    pBluescriptII SK-
321    109-013-1-0-B9-CS    PRT    pBluescriptII SK-
322    110-008-4-0-D9-CS    PRT    pBluescriptII SK-
323    114-001-3-0-A2-CS    PRT    pBluescriptII SK-
324    114-028-2-0-C1-CS    PRT    pBluescriptII SK-
325    114-032-1-0-H10-CS    PRT    pBluescriptII SK-
326    114-043-2-0-A10-CS    PRT    pBluescriptII SK-
327    114-044-1-0-C5-CS    PRT    pBluescriptII SK-
328    116-003-3-0-D10-CS    PRT    pBluescriptII SK-
329    116-003-3-0-G12-CS    PRT    pBluescriptII SK-
330    116-011-2-0-F11-CS    PRT    pBluescriptII SK-
331    116-033-3-0-E4-CS    PRT    pBluescriptII SK-
332    116-041-4-0-B6-CS    PRT    pBluescriptII SK-
333    116-044-2-0-C4-CS    PRT    pBluescriptII SK-
334    116-075-1-0-E6-CS    PRT    pBluescriptII SK-
335    116-094-4-0-G5-CS    PRT    pBluescriptII SK-
336    117-005-3-0-F2-CS    PRT    pBluescriptII SK-
337    121-007-3-0-D9-CS    PRT    pBluescriptII SK-
338    145-91-3-0-D10-CS    PRT    pBluescriptII SK-
339    157-17-1-0-F4-CS    PRT    pBluescriptII SK-
340    160-11-3-0-G8-CS    PRT    pBluescriptII SK-
341    160-24-1-0-F12-CS    PRT    pBluescriptII SK-
342    160-24-2-0-E9-CS    PRT    pBluescriptII SK-
343    160-25-4-0-D2-CS    PRT    pBluescriptII SK-
344    160-31-3-0-A11-CS    PRT    pBluescriptII SK-
345    160-32-1-0-F6-CS    PRT    pBluescriptII SK-
346    160-37-1-0-A3-CS    PRT    pBluescriptII SK-
347    160-40-3-0-E9-CS    PRT    pBluescriptII SK-
348    160-58-3-0-E4-CS    PRT    pBluescriptII SK-
349    160-85-3-0-D4-CS    PRT    pBluescriptII SK-
350    160-95-3-0-A11-CS    PRT    pBluescriptII SK-
351    162-10-4-0-F9-CS.cor    PRT    pBluescriptII SK-
352    162-10-4-0-F9-CS.fr    PRT    pBluescriptII SK-
353    174-13-2-0-E4-CS    PRT    pPT
354    174-46-2-0-B11-CS    PRT    pPT
355    179-8-2-0-A6-CS    PRT    pBluescriptII SK-
356    180-22-3-0-B6-CS    PRT    pBluescriptII SK-
357    181-13-1-0-F7-CS    PRT    pBluescriptII SK-
358    181-15-4-0-F7-CS    PRT    pBluescriptII SK-
359    181-20-1-0-G7-CS    PRT    pBluescriptII SK-
360    184-15-3-0-D1-CS    PRT    pBluescriptII SK-
361    187-12-2-0-G11-CS    PRT    pBluescriptII SK-
362    [illegible]    PRT    pBluescriptII SK-
363    [illegible]    PRT    pBluescriptII SK-
364    [illegible]    PRT    pBluescriptII SK-
365    [illegible]    PRT    pBluescriptII SK-
366    [illegible]    PRT    pBluescriptII SK-
367    [illegible]    PRT    pBluescriptII SK-
368    [illegible]    PRT    pBluescriptII SK-
369    [illegible]    PRT    pBluescriptII SK-
370    [illegible]    PRT    pBluescriptII SK-
371    [illegible]    PRT    pBluescriptII SK-
372    [illegible]    PRT    pBluescriptII SK-
373    [illegible]    PRT    pBluescriptII SK-
374    [illegible]    PRT    pBluescriptII SK-
375    187-5-3-0-D5-CS    PRT    pBluescriptII SK-
376    [illegible]    PRT    pBluescriptII SK-
377    [illegible]    PRT    pBluescriptII SK-
378    [illegible]    PRT    pBluescriptII SK-
379    [illegible]    PRT    pBluescriptII SK-
380    188-22-4-0-[?]6-CS    PRT    pBluescriptII SK-
381    188-28-4-0-[?]11-CS    PRT    pBluescriptII SK-
382    [illegible]    PRT    pBluescriptII SK-
383    188-34-4-0-E5-CS    PRT    pBluescriptII SK-
384    188-9-3-0-A5-CS    PRT    pBluescriptII SK-
385    105-021-3-0-C3-CS    PRT    pBluescriptII SK-
386    105-037-4-0-H12-CS    PRT    pBluescriptII SK-
387    105-073-2-0-A7-CS    PRT    pBluescriptII SK-
388    [illegible]    PRT    pBluescriptII SK-
389    [illegible]    PRT    pBluescriptII SK-
390    116-11[?]-4-0-A8-CS    PRT    pBluescriptII SK-
391    145-52-2-0-D12-CS    PRT    pBluescriptII SK-
392    [illegible]    PRT    pBluescriptII SK-
393    [illegible]    PRT    pBluescriptII SK-
394    157-17-2-0-C1-CS    PRT    pBluescriptII SK-
395    [illegible]    PRT    pBluescriptII SK-
396    [illegible]    PRT    pBluescriptII SK-
397    [illegible]    PRT    pBluescriptII SK-
398    [illegible]    PRT    pBluescriptII SK-
399    [illegible]    PRT    pBluescriptII SK-
400    [illegible]    PRT    pBluescriptII SK-
401    [illegible]    PRT    pBluescriptII SK-
402    [illegible]    PRT    pBluescriptII SK-
403    [illegible]    PRT    pBluescriptII SK-
404    [illegible]    PRT    pBluescriptII SK-
405    174-17-1-0-D6-CS    PRT    pPT
406    174-32-4-0-F8-CS    PRT    pPT
407    174-38-4-0-D11-CS    PRT    pPT
408    174-8-2-0-C10-CS    PRT    pPT
409    179-14-2-0-F11-CS    PRT    pBluescriptII SK-
410    179-9-4-0-B8-CS    PRT    pBluescriptII SK-
411    181-10-1-0-C9-CS    PRT    pBluescriptII SK-
412    187-5-3-0-C7-CS    PRT    pBluescriptII SK-
413    188-26-4-0-F5-CS    PRT    pBluescriptII SK-
414    188-27-3-0-G1-CS    PRT    pBluescriptII SK-
415    188-29-2-0-H1-CS    PRT    pBluescriptII SK-
416    188-31-1-0-E6-CS    PRT    pBluescriptII SK-
417    188-45-1-0-D3-CS    PRT    pBluescriptII SK-
418    188-5-1-0-H6-CS    PRT    pBluescriptII SK-
419    188-9-1-0-C10-CS    PRT    pBluescriptII SK-
420    105-016-3-0-C5-CS    PRT    pBluescriptII SK-
421    105-026-4-0-D9-CS    PRT    pBluescriptII SK-
422    105-053-2-0-D9-CS    PRT    pBluescriptII SK-
423    105-069-3-0-A11-CS    PRT    pBluescriptII SK-
424    105-076-4-0-F6-CS    PRT    pBluescriptII SK-
425    105-135-2-0-F9-CS    PRT    pBluescriptII SK-
426    106-023-4-0-F6-CS    PRT    pBluescriptII SK-
427    110-001-3-0-C11-CS    PRT    pBluescriptII SK-
428    110-002-3-0-F9-CS    PRT    pBluescriptII SK-
429    114-019-3-0-D9-CS    PRT    pBluescriptII SK-
430    114-029-1-0-C6-CS    PRT    pBluescriptII SK-
431    114-032-4-0-B1-CS    PRT    pBluescriptII SK-
432    114-070-2-0-H4-CS    PRT    pBluescriptII SK-
433    116-016-3-0-F11-CS    PRT    pBluescriptII SK-
434    116-022-4-0-G2-CS    PRT    pBluescriptII SK-
435    116-052-2-0-H8-CS    PRT    pBluescriptII SK-
436    116-053-4-0-B4-CS    PRT    pBluescriptII SK-
437    116-094-3-0-H2-CS    PRT    pBluescriptII SK-
438    116-112-4-0-C7-CS    PRT    pBluescriptII SK-
439    116-123-3-0-F12-CS    PRT    pBluescriptII SK-
440    123-008-1-0-C5-CS    PRT    pBluescriptII SK-
441    145-53-2-0-H8-CS    PRT    pBluescriptII SK-
442    145-57-2-0-C9-CS.cor    PRT    pBluescriptII SK-
443    145-57-2-0-C9-CS.fr    PRT    pBluescriptII SK-
444    145-7-3-0-B12-CS    PRT    pBluescriptII SK-
445    157-12-2-0-D1-CS    PRT    pBluescriptII SK-
446    157-16-2-0-D5-CS    PRT    pBluescriptII SK-
447    157-18-2-0-A7-CS    PRT    pBluescriptII SK-
448    160-103-1-0-B10-CS    PRT    pBluescriptII SK-
449    160-104-4-0-F3-CS    PRT    pBluescriptII SK-
450    160-22-2-0-D10-CS    PRT    pBluescriptII SK-
451    160-24-3-0-F12-CS    PRT    pBluescriptII SK-
452    160-3-2-0-H3-CS    PRT    pBluescriptII SK-
453    160-58-2-0-A2-CS    PRT    pBluescriptII SK-
454    160-73-1-0-B4-CS    PRT    pBluescriptII SK-
455    160-75-4-0-E6-CS    PRT    pBluescriptII SK-
456    160-97-3-0-E9-CS    PRT    pBluescriptII SK-
457    174-1-4-0-E9-CS    PRT    pPT
458    174-12-4-0-C2-CS    PRT    pPT
459    180-19-4-0-H2-CS    PRT    pBluescriptII SK-
460    181-10-4-0-G12-CS    PRT    pBluescriptII SK-
461    181-3-2-0-F6-CS    PRT    pBluescriptII SK-
462    181-4-4-0-A12-CS    PRT    pBluescriptII SK-
463    181-9-2-0-F12-CS.cor    PRT    pBluescriptII SK-
464    181-9-2-0-F12-CS.fr    PRT    pBluescriptII SK-
465    [illegible]    PRT    pBluescriptII SK-
466    [illegible]    PRT    pBluescriptII SK-
467    184-7-1-0-F7-CS    PRT    pBluescriptII SK-
468    [illegible]    PRT    pBluescriptII SK-
469    [illegible]    PRT    pBluescriptII SK-
470    [illegible]    PRT    pBluescriptII SK-
471    187-32-0-0-n21-CS.cor    PRT    pBluescriptII SK-
472    187-32-0-0-n21-CS.fr    PRT    pBluescriptII SK-
473    187-4-[?]-0-F6-CS    PRT    [illegible]
474    187-40-0-0-i15-CS    PRT    pBluescriptII SK-
475    187-47-0-0-g24-CS    PRT    pBluescriptII SK-
476    187-9-3-0-A2-CS    PRT    pBluescriptII SK-
477    188-26-4-0-H1-CS    PRT    pBluescriptII SK-
478    188-35-3-0-G9-CS    PRT    pBluescriptII SK-
479    188-38-4-0-D8-CS    PRT    pBluescriptII SK-
480    188-41-1-0-E6-CS    PRT    pBluescriptII SK-
481    188-42-2-0-F3-CS.cor    PRT    pBluescriptII SK-
482    188-42-2-0-F3-CS.fr    PRT    pBluescriptII SK-

Table II



Seq Id No 


Full coding 
sequence 


Signal 
sequence 


Coding sequence 
for mature 
protein 


Polyadenylation 
signal 


Polyadenylation 
site 


1 


[169-1692] 


[169-249] 


[250-1692] 


[2126-2131] 


[2152-2201] 


2 


[148-1140] 


[148-240] 


[241-1140] 


[1592-1597] 


[1615-1631] 


3 


[85-906] 


[85-135] 


[136-906] 


[1159-1164] 


[1184-1245] 


4 


[31-1248] 


[31-135] 


[136-1248] 


None detected 


[1607-1623] 


5 


[72-143] 


[72-119] 


[120-143] 


[1416-1421] 


[1438-1454] 


6 


[111-1154] 


[111-197] 


[198-1154] 


[1602-1607] 


[1623-1639] 


7 


[66-1256] 


[66-173] 


[174-1256] 


None detected 


[1752-1768] 


8 


[190-1398] 


[190-252] 


[253-1398] 


[1470-1475] 


[1494-1510] 


9 


[78-410] 


[78-155] 


[156-410] 


None detected 


[866-882] 


10 


[84-299] 


[84-134] 


[135-299] 


[1814-1819] 


[1833-1849] 


11 


[55-468] 


[55-99] 


[100-468] 


[531-536] 


[549-565] 


12 


[152-475] 


[152-244] 


[245-475] 


[1623-1628] 


[1647-1663] 


13 


[112-552] 


[112-183] 


[184-552] 


[706-711] 


[729-744] 


14 


[101-1243] 


[101-199] 


[200-1243] 


[1720-1725] 


[1745-1759] 


15 


[101-517] 


[101-199] 


[200-517] 


[1716-1721] 


[1741-1755] 


16 


[59-853] 


[59-100] 


[101-853] 


[894-899] 


[922-936] 


17 


[73-672] 


[73-132] 


[133-672] 


[689-694] 


[711-747] 


18 


[94-1275] 


[94-210] 


[211-1275] 


[1849-1854] 


[1870-1884] 


19 


[42-515] 


[42-92] 


[93-515] 


[649-654] 


[677-691] 


20 


[271-969] 


[271-366] 


[367-969] 


[1093-1098] 


[1124-1138] 


21 


[76-276] 


[76-135] 


[136-276] 


[436-441] 


[455-468] 


22 


[6-287] 


[6-80] 


[81-287] 


[684-689] 


[706-720] 


23 


[171-692] 


[171-227] 


[228-692] 


[691-696] 


[713-727] 


24 


[137-454] 


[137-187] 


[188-454] 


[440-445] 


[456-470] 


25 


[238-609] 


[238-291] 


[292-609] 


[948-953] 


[973-987] 


26 


[80-862] 


[80-127] 


[128-862] 


[875-880] 


[894-908] 


27 


[83-310] 


[83-157] 


[158-310] 


[725-730] 


[748-762] 


28 


[310-906] 


[310-357] 


[358-906] 


[1071-1076] 


[1088-1102] 


29 


[24-287] 


[24-131] 


[132-287] 


[405-410] 


[422-436] 


30 


[132-1574] 


[132-206] 


[207-1574] 


[1899-1904] 


[1923-1938] 


31 


[117-545] 


[117-245] 


[246-545] 


None detected 


[1100-1116] 


32 


[117-362] 


none detected 


[117-362] 


None detected 


[1098-1114] 


33 


[144-1262] 


[144-224] 


[225-1262] 


[2035-2040] 


[2056-2072] 


34 


[35-316] 


[35-109] 


[110-316] 


None detected 


[393-409] 


35 


[177-767] 


[177-236] 


[237-767] 


None detected 


[822-836] 


36 


[208-1239] 


[208-294] 


[295-1239] 


None detected 


[1307-1323] 


37 


[60-1682] 


[60-143] 


[144-1682] 


None detected 


[1929-1945] 


38 


[198-998] 


[198-269] 


[270-998] 


[1292-1297] 


[1315-1330] j 


39 


[505-1590] 


[505-624] 


[625-1590] 


[2089-2094] 


[2108-2124] 


40 


[84-326] 


[84-146] 


[147-326] 


[1122-1127] 


[1142-1159] 


41 


[56-1678] 


[56-139] 


[140-1678] 


None detected 


[1936-1953] 
42 | [119-1522] | [119-181] | [182-1522] |  | [1671-1688]
43 | [334-1551] | [334-426] | [427-1551] | None detected | [1925-1942]
44 | [72-986] | [72-149] | [150-986] | [1608-1613] | [1640-1657]
45 | [157-1482] | [157-219] | [220-1482] |  | [1716-1733]
46 | [195-1052] | [195-338] | [339-1052] | None detected | [1854-1871]
47 | [217-1410] | [217-279] | [280-1410] | [1482-1487] | [1507-1523]
48 | [103-492] | [103-162] | [163-492] | [794-799] | [815-832]
49 | [234-491] | [234-293] | [294-491] | [793-798] | [814-831]
50 | [180-800] | [180-248] | [249-800] | [880-885] | [901-917]
51 | [140-472] | [140-211] | [212-472] | None detected | [605-621]
52 | [68-484] | [68-112] | [113-484] |  | [657-673]
53 | [38-517] | [38-118] | [119-517] | [861-866] | [885-897]
54 | [92-634] | [92-139] | [140-634] |  |
55 | [27-767] | [27-80] | [81-767] | None detected | [1031-1047]
56 | [4-300] | [4-126] | [127-300] | [891-896] | [907-923]
57 | [127-879] | [127-198] | [199-879] | None detected | [1224-1240]
58 | [156-566] | [156-221] | [222-566] | [870-875] | [888-902]
59 | [35-1657] | [35-118] | [119-1657] |  | [1955-1969]
60 | [77-937] | [77-127] | [128-937] | [1095-1100] | [1116-1132]
61 | [9-503] | [9-113] | [114-503] | [594-599] | [615-631]
62 | [21-464] | [21-95] | [96-464] | [650-655] | [692-722]
63 | [178-1050] | [178-279] | [280-1050] | [1400-1405] | [1426-1442]
64 | [32-214] | [32-118] | [119-214] | [756-761] | [779-795]
65


J_ZZZ- /Z. v/J


[222-311]


T1 1 7 O701


[1191-1196]


[1220-1236]


66 | [101-355] | [101-160] | [161-355] | [772-777] | [788-881]
67 | [173-487] | [173-301] | [302-487] | [486-491] | [508-524]
68 | [210-1082] | [210-311] | [312-1082] | [1432-1437] | [1456-1472]
69 | [172-1449] | [172-255] | [256-1449] | None detected | [1721-1737]
70 | [30-1427] | [30-77] | [78-1427] | [1594-1599] | [1621-1637]
71 | [30-1175] | [30-77] | [78-1175] | [1593-1598] | [1620-1636]
72 | [66-830] | [66-173] | [174-830] | None detected | [1742-1758]
73 | [64-903] | [64-162] | [163-903] | [1612-1617] | [1631-1647]
74 | [64-585] | [64-162] | [163-585] | [1611-1616] | [1630-1646]
75 | [274-753] | [274-324] | [325-753] | [1931-1936] | [1947-1963]
76 | [191-1468] | [191-274] | [275-1468] |  | [1741-1757]
77 | [48-950] | [48-107] | [108-950] | [1983-1988] | [2011-2027]
78 | [156-512] | [156-206] | [207-512] | [1831-1836] | [1864-1880]
79 | [67-351] | [67-183] | [184-351] | None detected | [568-584]
80 | [259-831] | [259-375] | [376-831] | None detected | [1337-1351]
81 | [111-377] | [111-233] | [234-377] | [689-694] | [706-720]
82 | [223-432] | [223-336] | [337-432] | [986-991] | [1015-1029]
83 | [769-1272] | [769-843] | [844-1272] | None detected | [1774-1788]
84 | [30-527] | [30-74] | [75-527] | [738-743] | [756-805]
85 | [39-506] | [39-83] | [84-506] | None detected | [800-814]
86 | [115-429] | [115-210] | [211-429] | [565-570] | [584-598]
87 | [332-574] | [332-412] | [413-574] | None detected | [630-699]
88 | [133-417] | [133-213] | [214-417] | [876-881] | [891-905]
89 | [113-364] | [113-172] | [173-364] | None detected | [500-514]
90 | [9-380] | [9-104] | [105-380] | [483-488] | [504-518]
91 | [155-340] | [155-292] | [293-340] | [728-733] | [754-808]
92 | [185-634] | [185-253] | [254-634] | [704-709] | [723-737]
93 | [53-646] | [53-91] | [92-646] | [694-699] | [714-728]
94 | [247-510] | [247-318] | [319-510] | [544-549] | [568-582]
95 | [143-592] | [143-277] | [278-592] | [1877-1882] | [1898-1913]
96 | [33-458] | [33-89] | [90-458] | [637-642] | [654-670]
97 | [1-336] | [1-81] | [82-336] | [900-905] | [923-939]
98 | [174-443] | [174-269] | [270-443] | [629-634] | [647-661]
99 | [282-521] | [282-386] | [387-521] | [600-605] | [631-647]
100 | [251-643] | [251-295] | [296-643] | None detected | [990-1006]
101 | [179-475] | [179-295] | [296-475] | [995-1000] | [1015-1059]
102 | [34-327] | [34-162] | [163-327] | [466-471] | [498-514]
103 | [303-953] | [303-359] | [360-953] | [1124-1129] | [1142-1158]
104 | [97-645] | [97-156] | [157-645] | [1524-1529] | [1547-1563]
105 | [80-820] | [80-118] | [119-820] | [1587-1592] | [1606-1621]
106 | [77-388] | [77-217] | [218-388] | [524-529] | [541-557]
107 | [139-513] | [139-201] | [202-513] | [566-571] | [584-600]
108 | [81-986] | [81-134] | [135-986] | [1092-1097] | [1113-1129]
109 | [266-586] | [266-307] | [308-586] | [745-750] | [762-778]
110 | [59-745] | [59-160] | [161-745] | None detected | [1285-1301]
111 | [59-676] | [59-160] | [161-676] | None detected | [1284-1300]
112 | [15-278] | [15-146] | [147-278] | [1580-1585] | [1600-1617]
113 | [167-619] | [167-262] | [263-619] | [1598-1603] | [1617-1634]
114 | [223-417] | [223-270] | [271-417] | [655-660] | [677-693]
115 | [166-732] | [166-237] | [238-732] | [753-758] | [768-784]
116 | [75-623] | [75-215] | [216-623] | [767-772] | [788-804]
117 | [30-335] | [30-71] | [72-335] | [450-455] | [468-484]
118 | [21-752] | [21-107] | [108-752] | None detected | [970-985]
119 | [185-715] | [185-253] | [254-715] | [785-790] | [814-839]
120 | [54-527] | [54-116] | [117-527] | [545-550] | [567-583]
121 | [129-686] | [129-185] | [186-686] | [989-994] | [1008-1024]
122 | [165-614] | [165-305] | [306-614] | [719-724] | [744-760]
123 | [192-476] | [192-326] | [327-476] | [555-560] | [578-594]
124 | [16-297] | [16-93] | [94-297] | None detected | [543-559]
125 | [216-635] | [216-335] | [336-635] | [717-722] | [728-744]
126 | [164-280] | [164-268] | [269-280] | [789-794] | [809-824]
127 | [68-301] | [68-190] | [191-301] | [485-490] | [510-526]
128 | [179-427] | [179-298] | [299-427] | [579-584] | [602-618]
129 | [22-297] | [22-66] | [67-297] | [742-747] | [760-776]
130 | [9-845] | [9-134] | [135-845] | [964-969] | [983-998]
131 | [27-578] | [27-119] | [120-578] | [742-747] | [763-779]
132 | [408-710] | [408-533] | [534-710] | [985-990] | [1009-1025]
133 | [247-501] | [247-306] | [307-501] | None detected | [592-607]
134 | [333-602] | [333-416] | [417-602] | None detected | [761-774]
135 | [110-376] | [110-208] | [209-376] | [582-587] | [601-611]
136 | [22-417] | [22-66] | [67-417] | [888-893] | [909-925]
137 | [62-367] | [62-103] | [104-367] | [638-643] | [658-674]
138 | [107-1618] | [107-178] | [179-1618] | [1688-1693] | [1709-1725]
139 | [16-471] | [16-93] | [94-471] | None detected | [1458-1474]
140 | [222-374] | [222-299] | [300-374] | None detected | [637-653]
141 | [59-274] | [59-127] | [128-274] | [1452-1457] | [1474-1490]
142 | [158-442] | [158-301] | [302-442] | [621-626] | [645-661]
143 | [5-454] | [5-64] | [65-454] | [1745-1750] | [1773-1789]
144 | [241-1302] | none detected | [241-1302] | [1968-1973] | [1990-2006]
145 | [15-635] | none detected | [15-635] | [1057-1062] | [1080-1096]
146 | [109-738] | none detected | [109-738] | [1633-1638] | [1650-1666]
147 | [21-1145] | none detected | [21-1145] | [1648-1653] | [1666-1687]
148 | [70-1596] | none detected | [70-1596] | [1712-1717] | [1733-1747]
149 | [129-362] | none detected | [129-362] | [597-602] | [626-658]
150 | [109-594] | none detected | [109-594] | [1999-2004] | [2029-2045]
151 | [150-587] | none detected | [150-587] | None detected | [772-788]
152 | [173-847] | none detected | [173-847] | [1894-1899] | [1915-1931]
153 | [100-441] | none detected | [100-441] | [479-484] | [500-514]
154 | [32-1132] | none detected | [32-1132] | None detected | [1167-1183]
155 | [160-996] | none detected | [160-996] | [1504-1509] | [1529-1545]
156 | [11-529] | none detected | [11-529] | [1042-1047] | [1053-1068]
157 | [135-749] | none detected | [135-749] | [1055-1060] | [1081-1097]
158 | [98-637] | none detected | [98-637] | [862-867] | [878-894]
159 | [221-670] | none detected | [221-670] | [669-674] | [688-703]
160 | [165-674] | none detected | [165-674] | [808-813] | [833-849]
161 | [165-671] | none detected | [165-671] | [805-810] | [830-846]
162 | [28-1128] | none detected | [28-1128] | [1121-1126] | [1159-1176]
163 | [135-194] | none detected | [135-194] | [1050-1055] | [1068-1084]
164 | [173-847] | none detected | [173-847] | [1757-1762] | [1776-1793]
165 | [8-1141] | none detected | [8-1141] | None detected | [1832-1849]
166 | [136-264] | none detected | [136-264] | [1720-1725] | [1731-1748]
167 | [14-1048] | none detected | [14-1048] | [1234-1239] | [1258-1275]
168 | [70-777] | none detected | [70-777] | [987-992] | [1007-1023]
169 | [38-400] | none detected | [38-400] | [1043-1048] | [1069-1085]
170 | [63-572] | none detected | [63-572] | [750-755] | [767-776]
171 | [160-867] | none detected | [160-867] | [1178-1183] | [1203-1219]
172 | [68-640] | none detected | [68-640] | None detected | [1471-1487]
173 | [132-1298] | none detected | [132-1298] | [1873-1878] | [1899-1915]
174 | [259-1701] | none detected | [259-1701] | None detected | [1974-1990]
175 | [213-1274] | none detected | [213-1274] | [1940-1945] | [1955-1971]
176 | [68-127] | none detected | [68-127] | None detected | [1597-1613]
177 | [65-1024] | none detected | [65-1024] | [1291-1296] | [1315-1361]
178 | [109-585] | none detected | [109-585] | [1059-1064] | [1082-1113]
179 | [29-577] | none detected | [29-577] | [1917-1922] | [1944-1960]
180 | [23-451] | none detected | [23-451] | [1405-1410] | [1427-1443]
181 | [232-450] | none detected | [232-450] | None detected | [589-605]
182 | [758-1183] | none detected | [758-1183] | None detected | [1708-1724]
183 | [486-932] | none detected | [486-932] | None detected | [1670-1686]
184 | [80-304] | none detected | [80-304] | None detected | [452-463]
185 | [188-691] | none detected | [188-691] | [707-712] | [727-773]
186 | [94-573] | none detected | [94-573] | None detected | [739-753]
187 | [181-462] | none detected | [181-462] | None detected | [740-754]
188 | [6-290] | none detected | [6-290] | None detected | [971-998]
189 | [115-411] | none detected | [115-411] | [573-578] | [591-605]
190 | [3-368] | none detected | [3-368] | [481-486] | [511-526]
191 | [174-527] | none detected | [174-527] | [878-883] | [896-910]
192 | [57-203] | none detected | [57-203] | [579-584] | [599-668]
193 | [68-334] | none detected | [68-334] | [562-567] | [583-637]
194 | [183-443] | none detected | [183-443] | [670-675] | [692-706]
195 | [94-228] | none detected | [94-228] | None detected | [656-670]
196 | [133-327] | none detected | [133-327] | [465-470] | [496-510]
197 | [22-357] | none detected | [22-357] | None detected | [486-500]
198 | [4-333] | none detected | [4-333] | [633-638] | [653-667]
199 | [1-363] | none detected | [1-363] | [474-479] | [498-514]
200 | [41-337] | none detected | [41-337] | None detected | [401-462]
201 | [1-551] | none detected | [1-551] | None detected | [535-551]
202 | [34-315] | none detected | [34-315] | None detected | [534-550]
203 | [1-315] | none detected | [1-315] | [371-376] | [392-408]
204 | [94-582] | none detected | [94-582] | None detected | [651-665]
205 | [540-923] | none detected | [540-923] | None detected | [994-1008]
206 | [77-364] | none detected | [77-364] | [367-372] | [391-455]
207 | [65-544] | none detected | [65-544] | [710-715] | [733-749]
208 | [117-467] | none detected | [117-467] | [557-562] | [578-594]
209 | [893-1897] | none detected | [893-1897] | [2066-2071] | [2082-2098]
210 | [85-342] | none detected | [85-342] | None detected | [412-428]
211 | [155-433] | none detected | [155-433] | [713-718] | [735-769]
212 | [63-386] | none detected | [63-386] | [878-883] | [898-914]
213 | [460-1290] | none detected | [460-1290] | [1449-1454] | [1473-1489]
214 | [21-539] | none detected | [21-539] | [741-746] | [760-776]
215 | [34-1143] | none detected | [34-1143] | [1375-1380] | [1397-1412]
216 | [6-1184] | none detected | [6-1184] | [1735-1740] | [1744-1773]
217 | [29-376] | none detected | [29-376] | None detected | [1184-1251]
218 | [78-566] | none detected | [78-566] | [858-863] | [878-894]
219 | [16-705] | none detected | [16-705] | [868-873] | [894-910]
220 | [103-405] | none detected | [103-405] | [482-487] | [503-519]
221 | [72-350] | none detected | [72-350] | [593-598] | [616-632]
222 | [38-436] | none detected | [38-436] | None detected | [636-652]
223 | [38-322] | none detected | [38-322] | None detected | [634-650]
224 | [202-480] | none detected | [202-480] | [472-477] | [488-502]
225 | [171-1670] | none detected | [171-1670] | [1706-1711] | [1725-1739]
226 | [199-618] | none detected | [199-618] | [626-631] | [643-657]
227 | [182-481] | none detected | [182-481] | None detected | [874-888]
228 | [161-517] | none detected | [161-517] | None detected | [701-716]
229 | [86-505] | none detected | [86-505] | [618-623] | [638-654]
230 | [56-382] | none detected | [56-382] | [598-603] | [619-635]
231 | [56-355] | none detected | [56-355] | [597-602] | [618-634]
232 | [76-498] | none detected | [76-498] | [546-551] | [567-583]
233 | [199-600] | none detected | [199-600] | [705-710] | [737-753]
234 | [211-612] | none detected | [211-612] | [717-722] | [746-762]
235 | [5-259] | none detected | [5-259] | [502-507] | [521-537]
236 | [23-370] | none detected | [23-370] | [956-961] | [978-994]
237 | [41-352] | none detected | [41-352] | None detected | [646-662]
238 | [3-1319] | none detected | [3-1319] | [1791-1796] | [1813-1829]
239 | [421-768] | none detected | [421-768] | [1045-1050] | [1067-1083]
240 | [78-590] | none detected | [78-590] | None detected | [1815-1831]
241 | [78-608] | none detected | [78-608] | None detected | [1814-1830]
Table III

List of variants

92;119
14;15
110;111
69;174;76
2;12
172;176;177
150;152;164;166
154;162
77;143
34;62
230;231
63;68
8;47
48;49;66
7;72
160;161
144;175
17;21
31;32
5;6
3;10
96;121
37;41;59
70;71
19;24
186;195;204
73;74
240;241
221;235
222;223
42;45
157;163
190;229
117;137
122;233;234
201;202
80;139
Table IV

Seq Id No | Preferentially excluded fragments
1 | 192..235;2099..2201
2 | 174..225;1605..1631
3 | 1111..1245
4 | 1590..1598;1607..1623
5 | 1385..1453
6 | 1571..1639
7 | 1732..1768
8 | 1494..1510
9 | 570..882
10 | 1176..1218;1710..1742;1833..1849
11 | 219..253;455..565
12 | 178..229;1636..1663
13 | 729..744
14 | 790..827;1735..1759
15 | 788..825;1731..1755
16 | 922..936
17 | 668..747
18 | 1870..1884
19 | 677..691
20 | 1124..1138
21 | 450..468
22 | 393..411;706..720
23 | 713..727
24 | 456..470
25 | 876..928;973..987
26 | 894..908
27 | 748..762
28 | 1088..1102
29 | 422..436
30 | 1879..1918;1923..1938
31 | 774..1116
32 | 772..1114
33 | 2056..2072
34 | 393..409
35 | 784..836
36 | 544..551;1307..1323
37 | 1867..1874;1929..1945
38 | 1315..1330
39 | 2108..2124
40 | 413..421;1116..1159
41 | 1863..1870;1936..1953
42 | 1623..1688
43 | 1895..1942
44 | 1640..1657
45 | 1661..1733
46 | 1555..1871
47 | 1507..1523
48 | 541..832
49 | 540..831
50 | 901..917
51 | 2..10;605..621
52 | 585..673
53 | 885..897
54 | 4..13;761..1101
55 | 1031..1047
56 | 873..905;907..923
57 | 1224..1240
58 | 861..902
59 | 1842..1849;1955..1969
60 | 1116..1132
61 | 15..46;615..631
62 | 651..722
63 | 1426..1442
64 | 739..795
65 | 1220..1236
66 | 520..881
67 | 413..524
68 | 1444..1472
69 | 1721..1737
70 | 1621..1637
71 | 1620..1636
72 | 777..784;1742..1758
73 | 1631..1647
74 | 1630..1646
75 | 1947..1963
76 | 1741..1757
77 | 1561..1913;2011..2027
78 | 727..819;880..894;901..1280;1841..1880
79 | 418..584
80 | 331..353;844..1214;1337..1351
81 | 706..720
82 | 639..713;1008..1029
83 | 1454..1788
84 | 712..805
85 | 800..814
86 | 584..598
87 | 122..308;593..699
88 | 855..905
89 | 500..514
90 | 81..101;198..205;504..518
91 | 650..808
92 | 128..201;723..737
93 | 714..728
94 | 568..582
95 | 1761..1773;1898..1913
96 | 654..670
97 | 883..938
98 | 616..661
99 | 631..647
100 | 853..1006
101 | 537..544;949..1059
102 | 498..514
103 | 1142..1158
104 | 1524..1563
105 | 1230..1259;1606..1621
106 | 505..557
107 | 584..600
108 | 378..385;1113..1129
109 | 729..778
110 | 992..1301
111 | 991..1300
112 | 1131..1139;1569..1617
113 | 1526..1634
114 | 457..509;677..693
115 | 768..784
116 | 360..670;788..804
117 | 435..484
118 | 433..452;764..985
119 | 128..201;801..839
120 | 554..564;567..583
121 | 872..908;1008..1024
122 | 744..760
123 | 578..594
124 | 94..102;248..559
125 | 728..744
126 | 809..824
127 | 510..526
128 | 602..618
129 | 472..553;569..776
130 | 983..998
131 | 396..468;763..779
132 | 478..532;1009..1025
133 | 592..607
134 | 761..774
135 | 556..563;601..611
136 | 887..919
137 | 658..674
138 | 1651..1725
139 | 49..71;988..1358;1458..1474
140 | 324..653
141 | 720..730;1449..1490
142 | 44..119;498..505;578..585;645..661
143 | 1322..1666;1773..1789
144 | 1828..1897;1919..1968;1990..2006
145 | 936..955;1060..1096
146 | 778..827;1650..1666
147 | 1170..1207;1647..1687
148 | 1733..1747
149 | 579..658
150 | 1432..1440;1728..1778;2004..2045
151 | 772..788
152 | 1496..1504;1792..1842;1915..1931
153 | 500..514
154 | 1167..1183
155 | 1529..1545
156 | 703..1068
157 | 873..881;1081..1097
158 | 878..894
159 | 688..703
160 | 833..849
161 | 830..846
162 | 1159..1176
163 | 869..876;1068..1084
164 | 1444..1463;1496..1504;1743..1793
165 | 1233..1319;1697..1849
166 | 1407..1426;1459..1467;1694..1748
167 | 1258..1275
168 | 84..129;1002..1023
169 | 436..472;596..604;673..689;732..954;995..1085
170 | 767..776
171 | 1203..1219
172 | 1411..1487
173 | 1861..1915
174 | 1974..1990
175 | 1800..1869;1891..1940;1955..1971
176 | 1597..1613
177 | 186..212;1277..1361
178 | 930..978;1002..1113
179 | 951..1000;1364..1533;1944..1960
180 | 1427..1443
181 | 107..181;276..311;449..605
182 | 1143..1450;1677..1724
183 | 1..251;648..655;1347..1686
184 | 447..463
185 | 150..159;623..773
186 | 340..476;739..753
187 | 740..754
188 | 307..315;668..998
189 | 118..125;529..536;591..605
190 | 492..526
191 | 872..910
192 | 525..668
193 | 91..135;461..637
194 | 392..458;551..671;692..706
195 | 656..670
196 | 283..379;458..466;496..510
197 | 1..96;483..500
198 | 625..667
199 | 474..513
200 | 370..462
201 | 535..551
202 | 534..550
203 | 374..408
204 | 651..665
205 | 994..1008
206 | 348..455
207 | 733..749
208 | 1..49;578..594
209 | 2082..2098
210 | 412..428
211 | 689..769
212 | 898..914
213 | 1266..1489
214 | 760..776
215 | 1304..1311;1383..1412
216 | 648..691;1711..1773
217 | 644..856;910..1251
218 | 878..894
219 | 894..910
220 | 503..519
221 | 616..632
222 | 636..652
223 | 634..650
224 | 50..57;488..502
225 | 534..577;1725..1739
226 | 643..657
227 | 1..84;874..888
228 | 701..716
229 | 638..654
230 | 263..573;619..635
231 | 263..573;619..635
232 | 567..583
233 | 737..753
234 | 746..762
235 | 499..537
236 | 905..912;944..994
237 | 348..662
239 | 829..1083
240 | 1508..1831
241 | 1507..1830
Table Va

Seq Id No | Preferentially excluded fragments | Preferentially included fragments
1 | [1-540];[556-615];[2061-2096];[2098-2201] | [541-555];[616-2060];[2097-2097]
2 | [1-511];[533-619];[621-690];[730-1132] | [512-532];[620-620];[691-729];[1133-1631]
3 | [2-539];[1178-1245] | [1-1];[540-1177]
4 | [1-250];[297-383];[386-514];[1025-1064] | [251-296];[384-385];[515-1024];[1065-1623]
5 | [27-116];[118-391] | [1-26];[117-117];[392-1454]
6 | [1-93];[96-168];[170-262];[264-461] | [94-95];[169-169];[263-263];[462-1639]
7 | [1-95];[97-451] | [96-96];[452-1768]
8 | [1-502];[1314-1491] | [503-1313];[1492-1510]
9 | [1-864] | [865-882]
10 | [1-428] | [429-1849]
11 | [1-454];[482-514] | [455-481];[515-565]
12 | [1-375];[379-511];[533-690];[730-783];[814-1164] | [376-378];[512-532];[691-729];[784-813];[1165-1663]
13 | [2-337];[339-556] | [1-1];[338-338];[557-744]
14 | [29-366];[368-507] | [1-28];[367-367];[508-1759]
15 | [29-366];[368-524] | [1-28];[367-367];[525-1755]
16 | [1-641] | [642-936]
17 | [1-708];[711-747] | [709-710]
18 | [1-639] | [640-1884]
19 | [1-631] | [632-691]
20 | [3-416];[418-490] | [1-2];[417-417];[491-1138]
21 | [1-468] | None
22 | [1-720] | None
23 | [1-711] | [712-727]
24 | [1-469] | [470-470]
25 | [1-231];[234-488] | [232-233];[489-987]
26 | [1-296];[300-642];[644-737] | [297-299];[643-643];[738-908]
27 | [1-306];[308-762] | [307-307]
28 | [1-446];[448-1102] | [447-447]
29 | [1-436] | None
30 | [7-334];[1420-1468];[1474-1614];[1616-1804];[1845-1919] | [1-6];[335-1419];[1469-1473];[1615-1615];[1805-1844];[1920-1938]
31 | [1-342];[345-519];[823-893];[977-1016] | [343-344];[520-822];[894-976];[1017-1116]
32 | [1-517];[821-891];[975-1014] | [518-820];[892-974];[1015-1114]
33 | [36-352];[354-457];[728-832];[834-1096];[1253-1289];[1291-1350];[1352-1412];[1726-1873] | [1-35];[353-353];[458-727];[833-833];[1097-1252];[1290-1290];[1351-1351];[1413-1725];[1874-2072]
34 | [1-409] | None
35 | [14-105] | [1-13];[106-836]
36 | [1-572];[1120-1271] | [573-1119];[1272-1323]
37 | [20-98];[100-510];[1591-1681];[1683-1870] | [1-19];[99-99];[511-1590];[1682-1682];[1871-1945]
38 | [1-547] | [548-1330]
39 | [1-445] | [446-2124]
40 | [1-473];[475-528] | [474-474];[529-1159]
41 | [16-506];[1587-1866] | [1-15];[507-1586];[1867-1953]
42 | [2-234];[244-451];[974-1226] | [1-1];[235-243];[452-973];[1227-1688]
43 | [1-455];[1670-1925] | [456-1669];[1926-1942]
44 | [1-579];[815-1031] | [580-814];[1032-1657]
45 | [1-489];[1012-1264] | [490-1011];[1265-1733]
46 | [1-400];[1184-1223];[1225-1705];[1740-1818] | [401-1183];[1224-1224];[1706-1739];[1819-1871]
47 | [1-529];[1326-1505] | [530-1325];[1506-1523]
48 | [1-131];[133-510];[560-589] | [132-132];[511-559];[590-832]
49 | [1-130];[132-509];[559-588] | [131-131];[510-558];[589-831]
50 | [1-650];[652-868];[873-913] | [651-651];[869-872];[914-917]
51 | [1-504];[515-605] | [505-514];[606-621]
52 | [1-535] | [536-673]
53 | [2-563] | [1-1];[564-897]
54 | [1-527];[802-870];[882-934];[966-1018];[1037-1080] | [528-801];[871-881];[935-965];[1019-1036];[1081-1101]
55 | [1-326];[328-505] | [327-327];[506-1047]
56 | [1-340] | [341-925]
57 | [1-528] | [529-1240]
58 | [1-108];[115-151];[154-340];[342-529] | [109-114];[152-153];[341-341];[530-902]
59 | [4-485];[1566-1656];[1658-1845] | [1-3];[486-1565];[1657-1657];[1846-1969]
60 | [1-283] | [284-1132]
61 | [9-468] | [1-8];[469-631]
62 | [1-525];[689-722] | [526-688]
63 | [1-88];[90-192];[194-265];[296-409] | [89-89];[193-193];[266-295];[410-1442]
64 | [1-517] | [518-795]
65 | [1-406];[408-739] | [407-407];[740-1236]
66 | [1-489];[849-881] | [490-848]
67 | [1-505] | [506-524]
68 | [1-325];[328-441];[444-504] | [326-327];[442-443];[505-1472]
69 | [1-524];[636-715];[717-809];[811-885];[1567-1715] | [525-635];[716-716];[810-810];[886-1566];[1716-1737]
70 | [12-487] | [1-11];[488-1637]
71 | [12-487] | [1-11];[488-1636]
72 | [1-451] | [452-1758]
73 | [1-167];[242-464] | [168-241];[465-1647]
74 | [1-167];[242-464] | [168-241];[465-1646]
75 | [1-471] | [472-1963]
76 | [1-358];[360-543];[655-734];[736-828];[830-904];[1586-1734] | [359-359];[544-654];[735-735];[829-829];[905-1585];[1735-1757]
77 | [3-34];[36-474];[582-770];[1709-1746];[1748-1785];[1825-1899] | [1-2];[35-35];[475-581];[771-1708];[1747-1747];[1786-1824];[1900-2027]
78 | [1-75];[77-319];[914-1052];[1063-1126];[1168-1203] | [76-76];[320-913];[1053-1062];[1127-1167];[1204-1880]
79 | [1-425] | [426-584]
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oU 
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yy 
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[1-3Z0], [5oz-yD4j; [y / o- 1 uo j j 


[jz i-oo i j,[yj j-y / jj,[ luuo- luuoj 


101


n /icoi-r^Qi aaii.tiaia i acoi 

[ i-4oyj,po i -yoi j;[iuiu-iUjyj 


rdon ssoi rQA? inrkQi 


102 | [1-485] | [486-514]
103 | [1-540] | [541-1158]
104 | [1-556] | [557-1563]
105 | [1-868];[870-1006] | [869-869];[1007-1621]
106 | [1-491] | [492-557]
107 | [1-573] | [574-600]
108 | [1-457];[586-1110] | [458-585];[1111-1129]
109 | [1-521];[655-778] | [522-654]
110 | [1-416];[478-614];[616-990];[992-1065];[1068-1283] | [417-477];[615-615];[991-991];[1066-1067];[1284-1301]
111 | [1-416];[478-614];[628-989];[991-1064];[1067-1282] | [417-477];[615-627];[990-990];[1065-1066];[1283-1300]
112 | [2-429];[1161-1202];[1212-1388];[1392-1589] | [1-1];[430-1160];[1203-1211];[1389-1391];[1590-1617]
113 | [1-487] | [488-1634]
114 | [1-70];[86-496] | [71-85];[497-693]
115 | [1-358];[360-558] | [359-359];[559-784]
116 | [1-215];[218-495];[527-607] | [216-217];[496-526];[608-804]
117 | [1-466] | [467-484]
118 | [1-515];[906-963] | [516-905];[964-985]
119 | [1-744];[746-816] | [745-745];[817-839]
120 | [1-85];[87-521] | [86-86];[522-583]
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121 


[1-532] 


[533-1024] 


122 


[1-318]; [325-5 1 7]; [567-660] 


[3 1 9-324];[5 1 8-566];[66 1 -760] 


123 


[1-498] 


[499-594] 


124 


[1-427] 


[428-559] 


125 


[1-642] 


[643-744] 


126 


[l-341];[350-696] 


[342-349];[697-824] 


127 


[1-482] 


[483-526] 


128 


[1-338] 


[339-618] 


129 


[l-191];[193-429];[450-678] 


[192-192];[430^49];[679-776] 


130 


[19-463];[465-544] 


[l-18];[464-464];[545-998] 


131 


[1-470] 


[471-779] 


132 


[1-533] 


[534-1025] 


133 


[1-498] 


[499-607] 


134 


[l-168];[170-326];[328-471];[552-738] 


[169-169];[327-327];[472-551];[739-774]


135 


[1-346];[348-395];[440-473]


[347-347];[396-439];[474-611]


136 


[1-324];[343-436]


[325-342];[437-925] 


137 


[1-186];[188-251];[255-517]


[187-187];[252-254];[518-674]


138


[1-488]


[489-1725] 


139 


[1-101];[103-190];[292-327];[1091-1161];[1228-1314]


[102-102];[191-291];[328-1090];[1162-1227];[1315-1474]


140


[1-465];[516-653]


[466-515]


141


[1-761];[763-857];[912-1326]


[762-762];[858-911];[1327-1490]


142


[1-476]


[477-661]


143


[1-531];[1471-1508];[1510-1547];[1587-1661]


[532-1470];[1509-1509];[1548-1586];[1662-1789]


144 


[1-492];[503-536]


[493-502];[537-2006] 


145 


[1-570] 


[571-1096] 


146 


[1-536];[621-703];[729-1075];[1198-1445]


[537-620];[704-728];[1076-1197];[1446-1666]


147 


[1-555];[578-628]


[556-577];[629-1687]


148 


[1-444];[1201-1474];[1480-1516] 


[445-1200];[1475-1479];[1517-1747]


149 


[1-613];[626-658]


[614-625] 


150 


[4-199];[201-419];[421-492]


[1-3];[200-200];[420-420];[493-2045]


151 


[1-509] 


[510-788] 


152 


[1-483];[485-578]


[484-484];[579-1931] 


153 


[1-497] 


[498-514] 


154 


[5-509];[579-763];[765-1162] 


[1-4];[510-578];[764-764];[1163-1183]


155 


[1-486];[1095-1500] 


[487-1094];[1501-1545]


156 


[1-488];[740-797];[799-884];[895-974]


[489-739];[798-798];[885-894];[975-1068] 


157 


[1-161];[163-565];[567-701]


[162-162];[566-566];[702-1097] 


158 


[1-496];[692-754]


[497-691];[755-894]


159 


[1-483] 


[484-703]


160 


[1-494] 


[495-849] 


161 


[1-491] 


[492-846] 


162 


[1-505];[575-759];[761-1164]


[506-574];[760-760];[1165-1176]


163 


[1-699] 


[700-1084] 


164 


[38-483];[485-556] 


[1-37];[484-484];[557-1793]
165


[1-426];[1303-1444];[1717-1755];[1787-1825]


[427-1302];[1445-1716];[1756-1786];[1826-1849]


166 


[2-264];[266-446];[448-519]


[1-1];[265-265];[447-447];[520-1748]


167 


[1-519];[523-552]


[520-522];[553-1275] 


168 


[1-457];[466-571]


[458-465];[572-1023] 


169


[1-54];[57-501]


[55-56];[502-1085] 


170 


[1-541] 


[542-776] 


171 


[1-489] 


[490-1219] 


172 


[1-538];[977-1468]


[539-976];[1469-1487] 


173 


[1-631] 


[632-1915] 


174 


[21-776];[888-967];[969-1061];[1063- 
1137];[1819-1967] 


[1-20];[777-887];[968-968];[1062-1062];[1138-1818];[1968-1990]


175 


[1-508] 


[509-1971] 


176 


[1-127];[129-538];[979-1470]


[128-128];[539-978];[1471-1613] 


177 


[1-535];[973-1173];[1177-1330];[1332-1361]


[536-972];[1174-1176];[1331-1331]


178 


[1-599];[626-830];[1082-1113]


[600-625];[831-1081] 


179 


[1-623];[1377-1406] 


[624-1376];[1407-1960]


180 


[1-414];[418-464]


[415-417];[465-1443] 


181 


[1-522];[533-587]


[523-532];[588-605] 


182 


[1-78];[99-131];[136-327];[1153-1184];[1210-1274];[1284-1319];[1385-1416]


[79-98];[132-135];[328-1152];[1185-1209];[1275-1283];[1320-1384];[1417-1724]


183 


[1-512];[617-805];[871-952];[1387- 
1422];[1621-1661] 


[513-616];[806-870];[953-1386];[1423- 
1620];[1662-1686] 


184 


[1-453] 


[454-463] 


185 


[1-773] 


None 


186 


[1-413];[423-604];[606-739]


[414-422];[605-605];[740-753] 


187 


[1-117];[119-401]


[118-118];[402-754]


188 


[1-511];[684-870];[872-928];[935-981]


[512-683];[871-871];[929-934];[982-998] 


189 


[1-605] 


None 


190 


[2-475] 


[1-1];[476-526]


191 


[1-910] 


None 


192 


[1-101];[103-668]


[102-102] 


193 


[1-520];[583-637]


[521-582] 


194 


[1-706] 


None 


195 


[1-145];[150-451];[466-670]


[146-149];[452-465] 


196 


[1-509] 


[510-510] 


197 


[1-500] 


None 


198 


[1-503];[505-585]


[504-504];[586-667] 


199 


[1-498] 


[499-514] 


200 


[1-462] 


None 


201 


[1-551] 


None 


202 


[1-482];[484-550]


[483-483] 


203 


[1-408] 


None 


204 


[1-519];[521-649]


[520-520];[650-665]


205 


[1-261];[263-415];[417-640];[642-782]


[262-262];[416-416];[641-641];[783-1008]


206 


[1-455] 


None 
207


[1-402];[410-526]


[403-409];[527-749]


208 


[1-520] 


[521-594]


209 


[1-197];[200-472]


[198-199];[473-2098] 


210 


[1-311];[314-427]


[312-313];[428-428] 


211 


[1-689];[735-769]


[690-734] 


212 


[1-517] 


[518-914] 


213 


[2-576];[756-795];[1390-1441] 


[1-1];[577-755];[796-1389];[1442-1489]


214 


[1-482] 


[483-776] 


215 


[1-498] 


[499-1412] 


216 


[1-505];[1000-1293];[1295-1408];[1744-1773] 


[506-999];[1294-1294];[1409-1743]


217 


[1-102];[104-291];[293-467];[486-708];[723-831];[833-900];[910-1031];[1054-1090];[1097-1153]


[103-103];[292-292];[468-485];[709-722];[832-832];[901-909];[1032-1053];[1091-1096];[1154-1251]


218 


[1-452] 


[453-894] 


219 


[1-554];[556-598]


[555-555];[599-910] 


220 


[1-38];[41-95];[98-386];[388-487]


[39-40];[96-97];[387-387];[488-519] 


221 


[1-34];[38-220];[222-335];[337-518]


[35-37];[221-221];[336-336];[519-632] 


222 


[1-468] 


[469-652] 


223 


[1-466] 


[467-650] 


224 


[1-466] 


[467-502] 


225 


[1-489];[653-1008]


[490-652];[1009-1739]


226 


[1-657] 


None 


227 


[1-480] 


[481-888] 


228 


[1-501] 


[502-716] 


229 


[1-612] 


[613-654] 


230 


[1-477];[485-538]


[478-484];[539-635] 


231


[1-476];[484-537]


[477-483];[538-634] 


232 


[1-367];[371-512] 


[368-370];[513-583]


233 


[1-305];[307-442];[460-503];[553-646]


[306-306];[443-459];[504-552];[647-753]


234 


[1-260];[262-345];[347-454];[473-515];[565-658]


[261-261];[346-346];[455-472];[516-564];[659-762]


235 


[1-427] 


[428-537] 


236 


[1-465] 


[466-994] 


237 


[1-471];[496-526];[557-587];[597-637]


[472-495];[527-556];[588-596];[638-662] 


238 


[1-338];[352-497]


[339-351];[498-1829]


239 


[1-501] 


[502-1083] 


240 


[1-515];[1527-1583];[1585-1687];[1692-1831]


[516-1526];[1584-1584];[1688-1691] 


241


[1-515];[1526-1582];[1584-1686];[1691-1830]


[516-1525];[1583-1583];[1687-1690] 
Table Vb



Seq Id No 


Preferentially excluded fragments 


Preferentially included fragments 


1 


[1-540];[556-615];[2061-2096];[2098-2201]


[541-555];[616-2060];[2097-2097]


2 


[1-511];[533-619];[621-690];[730-1132]


[512-532];[620-620];[691-729];[1133-1631]


3 


[2-539];[1178-1245]


[1-1];[540-1177]


4 


[1-250];[297-383];[386-514];[1025-1064]


[251-296];[384-385];[515-1024];[1065-1623] 


5 


[27-116];[118-391]


[1-26];[117-117];[392-1454]


6 


[1-93];[96-168];[170-262];[264-461]


[94-95];[169-169];[263-263];[462-1639]


7 


[1-95];[97-451]


[96-96];[452-1768] 


8 


[1-502];[1314-1491] 


[503-1313];[1492-1510]


9 


[1-864] 


[865-882] 


10 


[1-428] 


[429-1849] 


11 


[1-454];[482-514]


[455-481];[515-565]


12 


[1-375];[379-511];[533-690];[730-783];[814-1164]


[376-378];[512-532];[691-729];[784-813];[1165- 
1663] 


13 


[2-337];[339-556] 


[1-1];[338-338];[557-744]


14 


[29-366];[368-507] 


[1-28];[367-367];[508-1759]


15 


[29-366];[368-524] 


[1-28];[367-367];[525-1755]


16 


[1-641] 


[642-936] 


17 


[1-708];[711-747]


[709-710] 


18 


[1-639] 


[640-1884] 


19 


[1-631] 


[632-691] 


20 


[3-416];[418-490]


[1-2];[417-417];[491-1138]


21 


[1-468] 


None 


22 


[1-720] 


None 


23 


[1-711] 


[712-727] 


24 


[1-469] 


[470-470]


25 


[1-231];[234-488]


[232-233];[489-987] 


26 


[1-296];[300-642];[644-737]


[297-299];[643-643];[738-908]


27 


[1-306];[308-762]


[307-307] 


28 


[1-446];[448-1102]


[447-447]


29 


[1-436] 


None 


30 


[7-334];[1420-1468];[1474-1614];[1616-1804];[1845-1919]


[1-6];[335-1419];[1469-1473];[1615-1615];[1805-1844];[1920-1938]


31 


[1-342];[345-519];[823-893];[977-1016]


[343-344];[520-822];[894-976];[1017-1116]


32 


[1-517];[821-891];[975-1014]


[518-820];[892-974];[1015-1114] 


33 


[36-352];[354-457];[728-832];[834-1096];[1253-1289];[1291-1350];[1352-1412];[1726-1873]


[1-35];[353-353];[458-727];[833-833];[1097-1252];[1290-1290];[1351-1351];[1413-1725];[1874-2072]


34 


[1-409]


None 


35 


[14-105] 


[1-13];[106-836]


36 


[1-572];[1120-1271]


[573-1119];[1272-1323]


37 


[20-98];[100-510];[1591-1681];[1683-1870]


[1-19];[99-99];[511-1590];[1682-1682];[1871-1945]


38 


[1-547] 


[548-1330] 
39


[1-445] 


[446-2124] 


40 


[1-473];[475-528]


[474-474];[529-1159]


41 


[16-506];[1587-1866]


[1-15];[507-1586];[1867-1953]


42 


[2-234];[244-451];[974-1226] 


[1-1];[235-243];[452-973];[1227-1688]


43 


[1-455];[1670-1925] 


[456-1669];[1926-1942] 


44 


[1-579];[815-1031] 


[580-814];[1032-1657]


45 


[1-489];[1012-1264] 


[490-1011];[1265-1733]


46 


[1-400];[1184-1223];[1225-1705];[1740-1818]


[401-1183];[1224-1224];[1706-1739];[1819-

1 C7 1 1 


47 


[1-529];[1326-1505] 


[530-1325];[1506-1523]


48 


rui in-ri ii-s i oirs60-SRQ] 


N 17-1 17WS 1 1 SSQ1KOO £171 


49 


ri-i iov n 17-soqwssq-ss.ri 


nil inwsin ssswsrq 8in ' 


50


r 1 -6soif6s7 R6si-r£7i oin 


r6Si ^ii-ffiriQ ft77i roi/i Qi7i 

[03 1 -03 1 J ,[O0y-O /ZJ 3 [V 1 / J 


51


n -soziirs i s aosi 


[JUj-J I4J,[0U0-0z 1 J j 


52


n sisi 


[D.7D-0 / j J 


53


T7-S611 


[ I - 1 Jj[304-oy / J 


54


n < N77i-rKn'"> ft7m.rcs7 qi/iwq^ 
1018];[1037-1080] 


pzo-oui j,[o / i-oo i j;[93j-yojj;[ 1019- 
1036];[1081-1101]


55 


n -i?6i-n?8.-S0Si 


n77-^77i-rsn£ mzi7i i 

pZ/OZ / J,[JVJO- 1U4 / J 


56


n -1401 


T~IA] 09^1 

[->*+i-yzjj 


57 




rs7Q 1 7J.m 


58 


fi-iORi ri i s-i sii-n s4-i40iri4?-S7Qi 


[ l uy-i 1 4j,[ i jz- 1 j^j,[ji4 1 -34 1 j,[ j ju-yuzj 


59


['t-HOj J,[ I _>00-l 030J ,[ 1 OjO- i 0*4 J J 


[ i -3j,[4oo- iDODj^ ioj /-lod i o4o- 1 yoyj 




ri-?83i 


T7Rd 1 1 171 


61 


[9-468] 


L I -OJ ,[407-Oj j j 


62 


[1-525];[689-722]


[526-688] 


63


'i RSi ron ^Q'J^'[ )qa 7Asi-r7QA_/inQi 
i-oojjL-'u- i vzj,[ iy4-zojj,[zyo-4uyj 


rco QQi-rioi ion.r7^/i 7Qci.r/iin tAA^i 

Loy-oyj,[iy3-iy3j,[z()o-zyDj;[4iu-i44zj 




i-ji / j 


[j i o-/y3j 


65


N -Aft&l ldftR 71Q1 


[4U /-4U / J,[ /4U- 1 zJOJ 






[4yU-54oJ 


67


N S0S1 


rcnA *;7/11 


68


n i7si-n?K_44n-r4zLi snzii 

L 1 J , [JiO^r'T 1 J j [^rHH JUf J 


['X'y^^'xn^•\AAl-AA^\^'\ ^ ^c\ e \ 14771 

j/O-jZ / J,[44Z-443 J,[ JUJ - 14 / Zj 


69 


[1-524];[636-715];[717-809];[811-885];[1567-1715]


[525-635];[716-716];[810-810];[886-1566];[1716-1737]


70 




f 1 ii l-fzlSR 1 A171 
1-1 1 J, [40 0-1 Oj / J 


71 


[12-487] 


[1-11];[488-1636] 


72




rvl*\7 17*\Q1 
[4JZ- I / JO] 


73 


[1-167];[242-464]


[168-241];[465-1647] 


74


[1-167];[242-464]


[168-241];[465-1646]


75


[1-471]


[4 / z- 1 yojj j 


76 


[1-358];[360-543];[655-734];[736-828];[830-904];[1586-1734]


[359-359];[544-654];[735-735];[829-829];[905-1585];[1735-1757]


77 


[3-34];[36-474];[582-770];[1709- 
1746];[1748-1785];[1825-1899] 


[1-2];[35-35];[475-581];[771-1708];[1747-1747];[1786-1824];[1900-2027]


78 


[1-75];[77-319];[914-1052];[1063-1126];[1168-1203]


[76-76];[320-913];[1053-1062];[1127-1167];[1204-1880]


79 


[1-425] 


[426-584] 
80


[1-752];[947-1017];[1084-1170]


[753-946];[1018-1083];[1171-1351]


81 


[1-496];[498-720]


[497-497] 


82 


[1-324] 


[325-1029] 


83 


[1-477];[1474-1529];[1537-1566];[1577-1616];[1622-1662];[1717-1753]


[478-1473];[1530-1536];[1567-1576];[1617-1621];[1663-1716];[1754-1788]


84 


[1-496];[499-568];[752-805]


[497-498];[569-751] 


85 


[1-527] 


[528-814] 


86 


[1-360] 


[361-598]


87 


[1-78];[80-583];[625-699]


[79-79];[584-624]


88 


[1-889] 


[890-905] 


89 


[1-513] 


[514-514] 


90


[1-122];[124-155];[157-435];[437-517]


[123-123];[156-156];[436-436];[518-518]


91 


[1-133];[165-808]


[134-164] 


92 


[1-725] 


[726-737] 


93 


[1-409]


[410-728] 


94 


[1-331]


[332-582] 


95 


[1-410]


[411-1913] 


96 


[1-501]


[502-670] 


97


[1-141];[143-431] 


[142-142];[432-939] 


98 


[1-193]


[194-661] 


99 


[1-629]


[630-647] 


100 


[1-520];[862-954];[976-1005]


[521-861];[955-975];[1006-1006]


101 


[1-489];[581-961];[1010-1059]


[490-580];[962-1009]


102 


[1-485] 


[486-514] 


103 


[1-540]


[541-1158]


104 


[1-556] 


[557-1563] 


105 


[1-868];[870-1006]


[869-869];[1007-1621]


106 


[1-491] 


[492-557] 


107 


[1-573]


[574-600]


108 


[1-457];[586-1110]


[458-585];[1111-1129]


109 


[1-521];[655-778]


[522-654]

110


[1-416];[478-614];[616-990];[992-1065];[1068-1283]


[417-477];[615-615];[991-991];[1066-1067];[1284-1301]


111


[1-416];[478-614];[628-989];[991-1064];[1067-1282]


[417-477];[615-627];[990-990];[1065-1066];[1283-1300]


112


[2-429];[1161-1202];[1212-1388];[1392-1589]


[1-1];[430-1160];[1203-1211];[1389-1391];[1590-1617]


113 


[1-487] 


[488-1634] 


114 


[1-70];[86-496]


[71-85];[497-693] 


115 


[1-358];[360-558]


[359-359];[559-784] 


116 


[1-215];[218-495];[527-607]


[216-217];[496-526];[608-804]


117 


[1-466] 


[467-484] 


118 


[1-515];[906-963]


[516-905];[964-985] 


119 


[1-744];[746-816]


[745-745];[817-839]


120 


[1-85];[87-521]


[86-86];[522-583] 


121 


[1-532] 


[533-1024] 
122


[1-318];[325-517];[567-660]


[319-324];[518-566];[661-760]


123 


[1-498] 


[499-594] 


124 


[1-427] 


[428-559]


125 


[1-642] 


[643-744]


126 


[1-341];[350-696]


[342-349];[697-824]


127 


[1-482]


[483-526] 


128 


[1-338] 


[339-618]


129 


[1-191];[193-429];[450-678]


[192-192];[430-449];[679-776] 


130 


[19-463];[465-544]


[1-18];[464-464];[545-998]


131 


[1-470]


[471-779] 


132 


[1-533] 


[534-1025] 


133 


[1-498] 


[499-607] 


134 


[1-168];[170-326];[328-471];[552-738]


[169-169];[327-327];[472-551];[739-774] 


135 


[1-346];[348-395];[440-473]


[347-347];[396-439];[474-611]


136 


[1-324];[343-436]


[325-342];[437-925] 


137 


[1-186];[188-251];[255-517]


[187-187];[252-254];[518-674] 


138 


[1-488] 


[489-1725] 


139 


[1-101];[103-190];[292-327];[1091- 
1161];[1228-1314] 


[102-102];[191-291];[328-1090];[1162- 
1227];[1315-1474] 


140 


[1-465];[516-653]


[466-515] 


141 


[1-761];[763-857];[912-1326] 


[762-762];[858-911];[1327-1490]


142


[1-476] 


[477-661] 


143


[1-531];[1471-1508];[1510-1547];[1587- 
1661] 


[532-1470];[1509-1509];[1548-1586];[1662-1789]


144 


[1-492];[503-536]


[493-502];[537-2006] 


145 


[1-570] 


[571-1096] 


146 


[1-536];[621-703];[729-1075];[1198-1445]


[537-620];[704-728];[1076-1197];[1446-1666]


147 


[1-555];[578-628]


[556-577];[629-1687]


148 


[1-444];[1201-1474];[1480-1516]


[445-1200];[1475-1479];[1517-1747]


149 


[1-613];[626-658]


[614-625] 


150 


[4-199];[201-419];[421-492]


[1-3];[200-200];[420-420];[493-2045]


151


[1-509]


[510-788]


152


[1-483];[485-578]


[484-484];[579-1931]


153


[1-497]


[498-514]


154


[5-509];[579-763];[765-1162]


[1-4];[510-578];[764-764];[1163-1183]


155 


[1-486];[1095-1500]


[487-1094];[1501-1545] 


156 


[1-488];[740-797];[799-884];[895-974]


[489-739];[798-798];[885-894];[975-1068] 


157 


[1-161];[163-565];[567-701]


[162-162];[566-566];[702-1097]


158 


[1-496];[692-754]


[497-691];[755-894]


159 


[1-483] 


[484-703]


160 


[1-494] 


[495-849] 


161 


[1-491] 


[492-846] 


162 


[1-505];[575-759];[761-1164]


[506-574];[760-760];[1165-1176]


163 


[1-699] 


[700-1084]


164 


[38-483];[485-556] 


[1-37];[484-484];[557-1793]


165 


[1-426];[1303-1444];[1717-1755];[1787-1825]


[427-1302];[1445-1716];[1756-1786];[1826-1849]
166


[2-264];[266-446];[448-519]


[1-1];[265-265];[447-447];[520-1748]


167 


[1-519];[523-552]


[520-522];[553-1275] 


168 


[1-457];[466-571]


[458-465];[572-1023] 


169 


[1-54];[57-501]


[55-56];[502-1085] 


170 


[1-541] 


[542-776] 


171 


[1-489] 


[490-1219] 


172 


[1-538];[977-1468]


[539-976];[1469-1487]


173 


[1-631] 


[632-1915] 


174 


[21-776];[888-967];[969-1061];[1063- 
1137];[1819-1967] 


[1-20];[777-887];[968-968];[1062-1062];[1138-1818];[1968-1990]


175


[1-508] 


[509-1971] 


176 


[1-127];[129-538];[979-1470]


[128-128];[539-978];[1471-1613]


177


[1-535];[973-1173];[1177-1330];[1332-1361]


[536-972];[1174-1176];[1331-1331]


178


[1-599];[626-830];[1082-1113]


[600-625];[831-1081]


179


[1-623];[1377-1406]


[624-1376];[1407-1960]


180


[1-414];[418-464]


[415-417];[465-1443]


181


[1-522];[533-587]


[523-532];[588-605]


182


[1-78];[99-131];[136-327];[1153-1184];[1210-1274];[1284-1319];[1385-1416]


[79-98];[132-135];[328-1152];[1185-1209];[1275-1283];[1320-1384];[1417-1724]


183 


[1-512];[617-805];[871-952];[1387-1422];[1621-1661]


[513-616];[806-870];[953-1386];[1423-1620];[1662-1686]


184 


[1-453] 


[454-463]


185 


[1-773] 


None 


186 


[1-413];[423-604];[606-739]


[414-422];[605-605];[740-753] 


187 


[1-117];[119-401]


[118-118];[402-754]


188 


[1-511];[684-870];[872-928];[935-981]


[512-683];[871-871];[929-934];[982-998] 


189 


[1-605] 


None 


190 


[2-475] 


[1-1];[476-526]


191 


[1-910] 


None 


192 


[1-101];[103-668]


[102-102] 


193 


[1-520];[583-637]


[521-582] 


194 


[1-706] 


None 


195 


[1-145];[150-451];[466-670]


[146-149];[452-465] 


196 


[1-509] 


[510-510] 


197 


[1-500] 


None 


198 


[1-503];[505-585]


[504-504];[586-667] 


199 


[1-498] 


[499-514] 


200 


[1-462] 


None 


201 


[1-551] 


None 


202 


[1-482];[484-550]


[483-483] 


203 


[1-408]


None 


204 


[1-519];[521-649]


[520-520];[650-665] 


205 


[1-261];[263-415];[417-640];[642-782]


[262-262];[416-416];[641-641];[783-1008] 


206 


[1-455] 


None 


207 


[1-402];[410-526]


[403-409];[527-749] 


208 


[1-520] 


[521-594] 
209


[1-197];[200-472]


[198-199];[473-2098]


210 


[1-311];[314-427]


[312-313];[428-428] 


211 


[1-689];[735-769]


[690-734] 


212 


[1-517] 


[518-914] 


213 


[2-576];[756-795];[1390-1441] 


[1-1];[577-755];[796-1389];[1442-1489]


214 


[1-482] 


[483-776] 


215 


[1-498] 


[499-1412] 


216 


[1-505];[1000-1293];[1295-1408];[1744-1773]


[506-999];[1294-1294];[1409-1743]


217


[1-102];[104-291];[293-467];[486-708];[723-831];[833-900];[910-1031];[1054-1090];[1097-1153]


[103-103];[292-292];[468-485];[709-722];[832-832];[901-909];[1032-1053];[1091-1096];[1154-1251]


218


[1-452]


[453-894]


219


[1-554];[556-598]


[555-555];[599-910]


220


[1-38];[41-95];[98-386];[388-487]


[39-40];[96-97];[387-387];[488-519]


221


[1-34];[38-220];[222-335];[337-518]


[35-37];[221-221];[336-336];[519-632]


222


[1-468]


[469-652]


223


[1-466]


[467-650]


224


[1-466]


[467-502]


225


[1-489];[653-1008]


[490-652];[1009-1739]


226


[1-657]


None


227


[1-480]


[481-888]


228


[1-501]


[502-716]


229


[1-612]


[613-654]


230


[1-477];[485-538]


[478-484];[539-635]


231 


[l-476];[484-537] 


[477-483];[538-634] 


232 


[1-367];[371-512] 


[368-370];[513-583]


233 


[1-305];[307-442];[460-503];[553-646]


[306-306];[443-459];[504-552];[647-753] 


234 


[1-260];[262-345];[347-454];[473-515];[565-658]


[261-261];[346-346];[455-472];[516-564];[659- 
762] 


235 


[1-427] 


[428-537]


236 


[1-465] 


[466-994] 


237 


[1-471];[496-526];[557-587];[597-637]


[472-495];[527-556];[588-596];[638-662] 


238 


[1-338];[352-497]


[339-351];[498-1829]


239 


[1-501] 


[502-1083] 


240 


[1-515];[1527-1583];[1585-1687];[1692- 
1831] 


[516-1526];[1584-1584];[1688-1691]


241 


[1-515];[1526-1582];[1584-1686];[1691-1830]


[516-1525];[1583-1583];[1687-1690]
Table VI



Seq Id No 


Designation of domain 


Database 


Positions of 
domains 


242 


Cell attachment sequence 


PROSITE 


141-143 


242 


Peptidase family M20/M25/M40 


PFAM 


107-451 


244 


Mitochondrial energy transfer proteins 
signature 


PROSITE 


26-35 


244 


Mitochondrial energy transfer proteins 
signature 


PROSITE 


199-208 


244


Mitochondrial carrier proteins


PFAM


175;178-272


244 


Mitochondrial energy transfer proteins.


BLOCKSPLUS 


12-36 


244 


Mitochondrial energy transfer proteins. 


BLOCKSPLUS 


13-36 


244 


Mitochondrial energy transfer proteins.


BLOCKSPLUS 


131-144 


245 


Leucine zipper pattern 


PROSITE 


371-392 


249 


Leucine zipper pattern 


PROSITE 


20-41 


251 


Mitochondrial energy transfer proteins 
signature 


PROSITE 


26-35 


251 


Mitochondrial carrier proteins 


PFAM 


5-72 


251 


Mitochondrial energy transfer proteins. 


BLOCKSPLUS 


12-36 


251 


Mitochondrial energy transfer proteins. 


BLOCKSPLUS 


13-36 


254 


Pancreatic ribonuclease family signature


PROSITE 


63-69 


254 


Pancreatic ribonucleases 


PFAM 


26-143 


254 


PANCREATIC RIBONUCLEASE FAMILY 
SIGNATURE 


BLOCKSPLUS 


49-69 


254 


Pancreatic ribonuclease family proteins. 


BLOCKSPLUS 


115-140


254 


PANCREATIC RIBONUCLEASE FAMILY 
SIGNATURE 


BLOCKSPLUS 


92-110 


254 


PANCREATIC RIBONUCLEASE FAMILY 
SIGNATURE 


BLOCKSPLUS 


114-133 


254 


Pancreatic ribonuclease family proteins. 


BLOCKSPLUS 


30-40 


254


PANCREATIC RIBONUCLEASE FAMILY 
SIGNATURE 


BLOCKSPLUS 


114-137 


254 


PANCREATIC RIBONUCLEASE FAMILY 
SIGNATURE 


BLOCKSPLUS 


69-86 


255 


L-lactate dehydrogenase active site 


PROSITE 


239-245 


255 


lactate/malate dehydrogenase 


PFAM 


71-380 


255 


L-lactate dehydrogenase proteins. 


BLOCKSPLUS 


186-224 


255 


L-LACTATE DEHYDROGENASE 
SIGNATURE 


BLOCKSPLUS 


96-121 


255 


L-lactate dehydrogenase proteins. 


BLOCKSPLUS 


71-102


255 


L-lactate dehydrogenase proteins. 


BLOCKSPLUS 


238-256


255 


L-LACTATE DEHYDROGENASE 
SIGNATURE 


BLOCKSPLUS 


183-203 


255 


L-lactate dehydrogenase proteins. 


BLOCKSPLUS 


288-323
255


L-LACTATE DEHYDROGENASE 
SIGNATURE 


BLOCKSPLUS 


207-224 


255 


L-LACTATE DEHYDROGENASE 
SIGNATURE 


BLOCKSPLUS 


71-92 


255 


L-lactate dehydrogenase proteins. 


BLOCKSPLUS 


138-167 


256 


lactate/malate dehydrogenase 


PFAM 


71-124 


256 


L-LACTATE DEHYDROGENASE 
SIGNATURE 


BLOCKSPLUS 


96-121 


256 


L-lactate dehydrogenase proteins. 


BLOCKSPLUS 


71-102 


256 


L-LACTATE DEHYDROGENASE 
SIGNATURE 


BLOCKSPLUS 


71-92 


256 


L-lactate dehydrogenase proteins. 


BLOCKSPLUS 


71-100 


256 


L-LACTATE DEHYDROGENASE 
SIGNATURE 


BLOCKSPLUS 


71-84 


257 


Leucine zipper pattern 


PROSITE 


155-176 


259 


HORMA domain 


PFAM 


22-230 


261 


Leucine zipper pattern 


PROSITE 


142-163 


261 


Leucine zipper pattern 


PROSITE 


170-191 


263 


Leucine zipper pattern 


PROSITE 


15-36 


264 


Ubiquitin family 


PFAM 


1-82 


264 


Ubiquitin domain proteins. 


BLOCKSPLUS 


17-62 


264 


Ubiquitin domain proteins. 


BLOCKSPLUS 


21-68 


264 


Ubiquitin domain proteins. 


BLOCKSPLUS 


26-68 


264 


Ubiquitin domain proteins. 


BLOCKSPLUS 


17-68 


266 


u-PAR/Ly-6 domain 


PFAM 


60-119 


266 


Squash family of serine protease inhibit 


PFAM 


32-47 


267 


Zinc finger, C2H2 type, domain proteins. 


BLOCKSPLUS 


185-202 


271 


LBP / BPI / CETP family signature 


PROSITE 


28-60 


271 


Pyrokinins signature 


PROSITE 


324-328 


271 


LBP / BPI / CETP family 


PFAM 


10-479 


271 


LBP / BPI / CETP family proteins. 


BLOCKSPLUS 


72-118 


271 


LBP / BPI / CETP family proteins. 


BLOCKSPLUS 


209-253 


271


LBP / BPI / CETP family proteins.




TO CO 

Z0O5 


271 


LBP / BPI / CETP family proteins. 


BLOCKSPLUS 


275-309 


271


LBP / BPI / CETP family proteins.




It 1 1 0 

lb- 1 1 3 


272


Zinc finger, C3HC4 type (RING finger),
signature


PROSITE


102-111


272


Zinc finger, C3HC4 type (RING finger) 


PFAM 


87-129 


272 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BLOCKSPLUS 


102-111 


273 


Zinc finger, C3HC4 type (RING finger), 
signature 


PROSITE 


30-39 


273 


Zinc finger, C3HC4 type (RING finger) 


PFAM 


15-57 


273 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BLOCKSPLUS 


30-39 


274 


RNA 3'-terminal phosphate cyclase signature 


PROSITE 


157-167 


274 


RNA 3'-terminal phosphate cyclase 


PFAM 


1-368 
274


RNA 3'-terminal phosphate cyclase proteins. 


BLOCKSPLUS 


12-44 


274 


RNA 3'-terminal phosphate cyclase proteins.


BLOCKSPLUS 


157-168 


275 


Ribosomal L27 protein 


PFAM 


31-86 


277 


Cell attachment sequence 


PROSITE 


292-294 


277 


DHHC zinc finger domain 


PFAM 


140-204 


279 


Endogenous opioids neuropeptides precursors 
signature 


PROSITE 


26-65 


279 


Vertebrate endogenous opioids neurope 


PFAM 


3-257 


279 


Endogenous opioids neuropeptides precursors 
proteins. 


BLOCKSPLUS 


100-126 


279 


Endogenous opioids neuropeptides precursors 
proteins. 


BLOCKSPLUS 


209-237 


279 


Endogenous opioids neuropeptides precursors 
proteins. 


BLOCKSPLUS


43-66 


279 


Endogenous opioids neuropeptides precursors 
proteins. 


BLOCKSPLUS 


18-38 


279 


Endogenous opioids neuropeptides precursors 
proteins. 


BLOCKSPLUS 


24-36 


279


Endogenous opioids neuropeptides precursors 
proteins. 


BLOCKSPLUS 


105-125 


280


Leucine zipper pattern 


PROSITE 


136-157 


280


Leucine zipper pattern 


PROSITE 


272-293 


283 


Immunoglobulins and major histocompatibility
complex proteins signature


PROSITE


380-386


283 


Immunoglobulin domain 


PFAM 


205-285;318- 
384 


283 


Immunoglobulins and major histocompatibility 
complex proteins. 


BLOCKSPLUS 


319-336 


284 


Fucosyl transferase 


PFAM 


70-406 


285 


FAD/NAD-binding Cytochrome reductase


PFAM 


27-149 


285 


Oxidoreductase FAD/NAD-binding domain 


PFAM 


176-290 


285 


Eukaryotic molybdopterin oxidoreductases 
proteins. 


BLOCKSPLUS 


58-86 


285 


CYTOCHROME B5 REDUCTASE 
SIGNATURE 


BLOCKSPLUS 


75-86 


285 


CYTOCHROME B5 REDUCTASE
SIGNATURE 


BLOCKSPLUS 


274-283 


285 


CYTOCHROME B5 REDUCTASE 
SIGNATURE 


BLOCKSPLUS 


141-156 


285 


Eukaryotic molybdopterin oxidoreductases 
proteins. 


BLOCKSPLUS 


274-286 


285 


Eukaryotic molybdopterin oxidoreductases 
proteins. 


BLOCKSPLUS 


60-85 


285 


CYTOCHROME B5 REDUCTASE 
SIGNATURE 


BLOCKSPLUS 


181-198 


285 


FLAVOPROTEIN PYRIDINE NUCLEOTIDE 
CYTOCHROME REDUCTASE SIGNATURE 


BLOCKSPLUS 


181-197 
286


Immunoglobulins and major histocompatibility
complex proteins signature


PROSITE


380-386


286 


Immunoglobulin domain 


PFAM 


205-285;318-


286 


Immunoglobulins and major histocompatibility 
complex proteins. 


BLOCKSPLUS 


319-336 


287 


Leucine zipper pattern 


PROSITE 


126-147 


288 


Leucine zipper pattern 


PROSITE 


20-41 


291 


Tissue inhibitors of metalloproteinases
signature 


PROSITE 


24-36 


291 


Tissue inhibitor of metalloproteinases 


PFAM 


22-199 


291 


Tissue inhibitors of metalloproteinases proteins. 


BLOCKSPLUS 


21-46 


291 


Tissue inhibitors of metalloproteinases proteins. 


BLOCKSPLUS 


106-148 


291 


Tissue inhibitors of metalloproteinases proteins. 


BLOCKSPLUS 


81-95 


291 


Tissue inhibitors of metalloproteinases proteins. 


BLOCKSPLUS 


61-72 


294 


Domain of unknown function DUF59 


PFAM 


31-135 


296 


Immunoglobulin domain 


PFAM 


141-197 


297 


TonB-dependent receptor proteins signature 1 


PROSITE 


1-42 


298 


Fibroblast growth factor 


PFAM 


48-129 


299 


BolA-like protein


PFAM 


39-114 


299 


PROTEIN BOLA TRANSCRIPTION
REGULATION AC


BLOCKSPLUS 


68-98 


301 


Cell attachment sequence 


PROSITE 


172-174 


303 


Ribosomal L27 protein 


PFAM 


31-115 


304 


Leucine rich repeat C-terminal domain 


PFAM 


173-222 


304 


Leucine Rich Repeat 


PFAM 


92-115;116-139;140-163;164-185


309 


Leucine rich repeat C-terminal domain 


PFAM 


173-222 


309 


Leucine Rich Repeat 


PFAM 


92-115;116- 
139; 140- 


311 


NOL1/NOP2/sun family


PFAM 


201-276;353-378


311 


NOL1/NOP2/sun family proteins.


BLOCKSPLUS 


230-245 


311 


NOL1/NOP2/sun family proteins.


BLOCKSPLUS 


231-245 


312 


NOL1/NOP2/sun family


PFAM 


201-276 


312 


NOL1/NOP2/sun family proteins.


BLOCKSPLUS 


230-245 


312 


NOL1/NOP2/sun family proteins.


BLOCKSPLUS 


231-245 


314 


Leucine zipper pattern 


PROSITE 


8-29 


315 


Leucine zipper pattern 


PROSITE 


8-29 


341 


Immunoglobulin domain 


PFAM 


45-112 


349 


CDP-alcohol phosphatidyltransferases signature 


PROSITE 


54-76 


349 


Cytochrome b/b6 Qo site signature 


PROSITE 


97-102 


354 


SAM domain (Sterile alpha motif)


PFAM 


82-147 


361 


Ribosomal Proteins L2 


PFAM 


96-124 


368 


DAD family 


PFAM 


1-78 


370 


Ribosomal protein L34 


PFAM 


51-92 
385


Kelch motif 


PFAM 


20-66;68-114;116-162;164-209;211-265;270-316


386 


SPRY domain 


PFAM 


85-205 


388 


PHD-finger. 


BLOCKSPLUS 


329-339 


389 


Eukaryotic thiol (cysteine) proteases histidine 
active site 


PROSITE 


268-278 


389 


Heat shock hsp70 proteins family signature 3 


PROSITE 


332-346 


389 


Hsp70 protein 


PFAM 


3-509 


390 


Eukaryotic-type carbonic anhydrase


PFAM 


20-59 


391 


PMP-22/EMP/MP20/Claudin family 


PFAM 


4-162 


392 


Sec1 family.


BLOCKSPLUS 


89-107 


393 


PMP-22/EMP/MP20/Claudin family 


PFAM 


4-182 


394 


Myc-type, 'helix-loop-helix' dimerization
domain signature


PROSITE 


13-28 


395 


Glutathione S-transferases.


PFAM 


47-122;260-309 


396 


Transmembrane 4 family signature 


PROSITE 


112-134 


396 


Transmembrane 4 family


PFAM 


66-273 


396 


Transmembrane 4 family proteins. 


BLOCKSPLUS 


108-146 


396


TRANSMEMBRANE FOUR FAMILY
SIGNATURE


BLOCKSPLUS


1 9Q 1 5 1 




TRANSMEMBRANE FOUR FAMILY
SIGNATURE




1 08 1 77 
l Uo- 1 z / 


396 


TRANSMEMBRANE FOUR FAMILY
SIGNATURE


BLOCKSPLUS


247-274 


396 


TRANSMEMBRANE FOUR FAMILY
SIGNATURE 


BLOCKSPLUS


129-150 


396 


TRANSMEMBRANE FOUR FAMILY
SIGNATURE


BLOCKSPLUS 


128-154 


397 


ATP/GTP-binding site motif A (P-loop) 


PROSITE 


6-13 


397 


ADP-ribosylation factor family 


PFAM 


2-172 


398 


Isochorismatase family 


PFAM 


17-147 


399


PAP2 super family 


PFAM 


19-175 


400 


Zinc carboxypeptidases, zinc-binding region 2 
signature 


PROSITE 


117-127 


401 


Zinc finger, C2H2 type, domain 


PROSITE 


36-57 


401 


Zinc finger, C2H2 type, domain 


PROSITE 


73-93 


401 


Zinc finger, C2H2 type, domain 


PROSITE 


114-134


401 


Zinc finger, C2H2 type, domain 


PROSITE 


145-165 


401 


Zinc finger, C2H2 type 


PFAM 


34-57;71- 
93;112-134;143- 
165 


401 


Zinc finger, C2H2 type, domain proteins. 


BLOCKSPLUS 


145-162 


401 


Zinc finger, C2H2 type, domain proteins. 


BLOCKSPLUS 


114-131 


401 


Zinc finger, C2H2 type, domain proteins. 


BLOCKSPLUS 


73-90 


402 


Zinc finger, C2H2 type, domain 


PROSITE 


113-133
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402 


Zinc finger, C2H2 type, domain 


PROSITE 


144-164 


402 


Regulator of chromosome condensation 
(RCC1) signature 2 


PROSITE 


65-75 


402 


Zinc finger, C2H2 type


PFAM 


111-133;142-164


402 


Zinc finger, C2H2 type, domain proteins. 


BLOCKSPLUS 


144-161 


402 


Zinc finger, C2H2 type, domain proteins. 


BLOCKSPLUS 


113-130 


403 


Glutathione S-transferases. 


PFAM 


47-122;260-309 


405 


PMP-22/EMP/MP20/Claudin family 


PFAM 


4-182 


406


WD domain, G-beta repeat 


PFAM 


267-304;333- 
370 


408 


Rhomboid family 


PFAM 


186-323 


410 


Ank repeat 


PFAM 


47-79


410 


REPEAT PROTEIN ANK NUCLEAR 
ANKYR. 


BLOCKSPLUS 


78-89 


410 


Ank repeat proteins. 


BLOCKSPLUS 


48-56 


412 


Serine proteases, subtilase family, aspartic acid 
proteins. 


BLOCKSPLUS 


165-178 


414 


Sir2 family 


PFAM 


84-268 


416 


Kelch motif 


PFAM 


20-66;68-114;116-162;164-209;211-265;270-316


418 


Zinc-binding dehydrogenases 


PFAM 


16-313 


426 


Leucine zipper pattern 


PROSITE 


144-165 


447 


Cytochrome c family heme-binding site 
signature 


PROSITE 


19-24 


447 


Immunoglobulins and major histocompatibility 
complex proteins signature 


PROSITE 


17-23 


453 


eIF-6 family 


PFAM 


3-103 


454 


Cell attachment sequence 


PROSITE 


226-228 


456 


Leucine zipper pattern 


PROSITE 


211-232 


457 


Leucine zipper pattern 


PROSITE 


236-257 


466 


Zinc finger, C3HC4 type (RING finger), 
signature 


PROSITE 


56-65 


466 


SPRY domain 


PFAM 


375-500 


466 


Zinc finger, C3HC4 type (RING finger) 


PFAM 


41-81 


466 


B-box zinc finger. 


PFAM 


110-153 


466 


Domain in SPla and the RYanodine Receptor. 


BLOCKSPLUS 


359-381 


466 


Domain in SPla and the RYanodine Receptor. 


BLOCKSPLUS 


443-457 


466 


Domain in SPla and the RYanodine Receptor. 


BLOCKSPLUS 


359-380 


466 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BLOCKSPLUS 


56-65 


479 


UBX domain 


PFAM 


329-408 


481 


TBC domain 


PFAM 


65-171 


481 


Probable rabGAP domain proteins. 


BLOCKSPLUS 


153-159 


482 


TBC domain 


PFAM 


65-177 
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482 



Probable rabGAP domain proteins. 



BLOCKSPLUS 



153-159 
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Table VII 



-J V LI 111 ±^1J 




242 


98..109;119..127;136..147;156..170;242..248;255..265;318..32 
8*^S6 163-30Q 407 443 4S0-47S 490 


243 


3..9;59..65;69..79;113..126;142..155;193..198;212..220;231..245;302..315


244


29..36;33..42;79..87;139..147;269..274 


245 


101..107;141..151;156..165;196..207;225..233;242..251;253..260;284..298;323..330;339..347;395..406


247 


41..51;108..120;121..131;190..200;255..261;302..307


248 


5..11;38..46;52..60;75..83;92..99;133..150;167..183;187..200;210..219;244..252;270..286;335..345;354..371;390..397


249 


68..80;91..99;132..138;185..193;265..273;276..293;295..306;305..329;327..341;347..358;394..403


250 


28..37;60..67;73..81 


251 


33..45;64..71


252 


20..30;35..45;49..59;74..83


253 


3..9;59..65 


254 


22 33*35 52-53 67*70 77*80 100*106 117-142 147 


255 


116..123;147..156;201..208;262..278


256 


10..15;116..121


257 


41..51;52..66;72..80;94..101;120..127;134..147;180..193;204..210;227..240


258 


147..157;189..199


259 


52..59;66..76;103..113;115..127;131..140;143..148;181..199;242..250;253..262;262..273;279..289;330..341;342..366;373..394


260


94..107;112..119;125..134


261 


121..126;144..152;213..224 


263 


44..50 


264 


51..58;82..90;153..164 


266 


15..20;38..49;76..81;95..105


267 


74..91;94..99;117..130;140..154;153..161;175..184;201..210;228..240;250..255


268 


36..42;43..54 


269 


41..46;64..73;80..100;106..122;160..172 


270 


38..48;82..88 


271 


34..40;72..79;111..123;146..153;251..259;307..314;316..322;372..377;436..444


272 


12..17;51..58;75..85;128..136 


273 


4..13;56..64 


274 


34..46;120..127;157..163;182..191;231..240;259..267;273..279;291..299;344..355


275 


30..55;72..78 


276 


27..35;37..45;49..61;61..77;102..109;144..152;170..180;179..188


277 


61..67;147..152;154..166;284..299;308..313


278 


72..82;451..461;532..541


279 


24..31;72..84;83..92;97..111;144..149;161..182;181..189;192..198;204..214;216..233;241..254;256..263
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280 


15..20;50..56;61..66;204..212;251..260;354..362


281 


24..38;44..52


282 


72..82;451..461;532..541


283 


1..6;21..31;77..83;115..120;228..237;276..281;335..343;401..407;440..456;456..468


284 


4..9;39..47;50..66;78..94;111..122;132..141;169..174;190..202;213..220;243..252;261..274;282..300;369..376;379..389;395..403


285 


29..38;42..47;58..65;100..110;121..134;156..161;161..173;201..207;230..239;243..254;290..302


286 


1..6;21..28;77..83;115..120;228..237;276..281;335..343;401..407


287 


2..10;94..104;248..258;268..286 


288 


68..80;91..99;132..138;185..194;262..270;273..288;291..301;300..324;322..336;342..353;389..398


289 


23..38


291 


28..35;96..104;134..144;159..167;177..187;191..198


292 


1..7;56..64;66..73;77..92


293 


40..45;99..109 


294 


47..57;120..126 


295 


31..61;76..82;143..149;156..169


296 


133..143;151..156;161..167;169..181;185..194 


297 


50..58;59..69;113..123;120..137 


298 


45..55;52..63;106..117;118..128;126..131;148..155;157..164;172..190;212..221;232..247


299 


51..59;82..87;113..125;124..135


300 


72..82;451..461;532..541 


301 


43..52;88..105;192..211;255..271 


302 


3..18;37..44;57..65;70..76;98..113;121..134


303 


30..55;72..77;82..88;106..113 


304 


2..11;33..42;48..54;55..63;122..131;147..154;168..180;200..209;211..220;226..233;268..278;286..291


305 


22..31


306 


5..11;25..35;72..81;124..134;147..157;163..178;177..186;185..195;207..217


307 


23..38


308 


66..72;84..100 


309 


2..11;33..42;48..54;55..63;122..131;147..154;168..180;200..209;211..220;226..233;268..278;286..291


310 


45..52;60..68;88..94;99..109;113..120;121..134;162..171;169..184;194..202;209..215;223..235;239..248;273..281;292..301;319..329;336..341;389..394;398..405;421..426


311 


15..21;28..35;82..91;113..120;125..133;153..167;236..243;291..298;307..312;316..327;390..396;406..413;436..457


312 


15..21;28..35;82..91;113..120;125..133;153..167;236..243;291..298;307..312;316..327;352..370;370..382


313 


38..46;52..60;75..83;92..99;133..150;167..183;187..200;210..219;239..256


314 


36..42;52..58;65..70;80..87;143..155;161..168;176..185;203..208;263..272


315 


36..42;52..58;65..70;80..87;143..155;161..168


316 


33..47;49..58;106..117;125..132
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1 1 7 


4.S ^"V^A <Q-8S Q4-0Q 10Q-1M 170-191 \'\A'\f\l 1 7 1 • 1 fsQ 1 
84-194 207209 215 221 23V219 248*271 ^81-292 301*319 
..329;336..341;389..394;398..405;421..426 


318


42..54;69..77;124..130;148..153;156..165;186..210;219..227;271..286;293..300


319 


15..21;25..37;36..59;58..64;80..89;86..102;105..119 


320


1..6;81..89 


321


111..116 


322 


1..21;50..68;76..85 


323 


11..16;49..68 


325 


14..20;40..55;69..76;122..131;128..138;158..166


326 


18..45;61..68;81..89;110..141;142..149


327 


43..48;53..62;85..95


328 


4..9;34..46 


329 


1..7;58..69 


330 


34..42;46..52;56..66 


331 


59..69;72..80;80..89;90..98;110..115


332 


14..24;50..56 


333 


1..6;41..47;86..96;120..134 


334 


17..23;41..47;50..57;85..90;96..105;154..159;181..192;192..198 


335 


4..9;21..28 


336 


49..65;70..80 


337 


20..30;36..42;53..61;74..83;110..119;125..138;137..142 


338


21..55;55..65;62..81;88..100;99..107 


339 


33..47;52..60;73..82 


340 


5..12;13..21;32..51;65..72 


341 


19..30;44..52;51..61;68..82;96..108


342 


18..26;44..54;71..77 


343 


21..33;57..66;68..77;80..89;91..98 


344 


36..44;89..95;110..116;181..193 


345 


19..28;34..41;43..55;80..86;119..124;159..165;167..176


346 


26..35;39..49;88..95;136..145;207..217 


347 


2..10;26..32;52..68;75..86 


348 


71..77;82..89;114..125 


349 


54..66;70..76;294..302 


350 


16..25;31..37;69..80


351 


68..73;118..129;135..148


352 


68..73;118..129;135..148 


353 


10..18;61..69;68..77 


354 


13..20;44..51;57..85;94..102;143..151 


355 


18..30;38..43;43..54 


356 


1..6;40..45;120..126;129..154;153..162


357 


49..59;133..141;152..167 


358 


9..26;80..85;96..102


359 


98..104;181..187


360 


1..6;44..54;113..123;147..161 


361 


64..74;77..90;112..141 


362


20..30;37..42;53..61;74..83;110..119;125..134;138..147


363 


1..11;26..31;53..60;97..105;110..117;141..146


364 


16..24 


365


7..15
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1/^7 


Q 1 A 
O.. 14 


IAS 

JOS 


i O..ZJ 


1 


^ 1 A ■ 7 1 -74 £1 


170 


1Q d^ A^ fs.1-9.1 01 


171 


j /..4o,o i ..cso,vz..yy, i^u.. i jo, ioj.. ioy, i / v.. i o4,zuy..z io,Z4Z.. 

757-7S8 768 


I! 177 


77 81-01 104-141 146 


171 


71 77-86 01 


174 


1 7-11 47-47 57-50 65 


175 


s 10-55 60-81 oo 


176 


14 41-40 46-67 81 


377 


27..38;49..56;54..64;68..78;111..122 


17fi 

-WO 


0 7^80 85-06 107 

7..zo,ou..o j,yo. , iuz 


17Q 


41 5057 ^8-10^ 704*118 177-117 147 
4 I .,OU,v5Z-.3o, 1 V J - .ZU4, J IO..jZ/,jj / . . J4 / 


1 GO 


08 1 Ol 
70.. 1 \)j 


381 


46..51 


382 


67..72 


383 


4..12;25..33;73..81 


384 


42..54;69..77;117..127;125..141


385 


105..112;146..155;181..187;200..214;230..243;254..264;261..272;290..295


386 


46..53;92..100;107..114;116..123;126..132;176..182


387 


5..12;44..50;158..167;190..195 


388 


16..25;31..37;73..79;96..101;105..110;158..166;215..224;240..247;273..280;299..306;321..334;347..352;370..375


389 


28..35;72..81;93..100;101..113;183..192;218..228;284..300;318

TlA.llfl 1AA.AA1 A Z A , A O t AQC\.AC\1 CAA 

..J JU,JJv..J44,44j..4d4;4o 1 ,.4ov;4v I .._>(J0 


ion 


■j Q. lO /I 0 

i ..5,jy ,.4o 


1Q 1 


1 O/l 1 1/1.1 ^ -7 1 (L") 

1U4.. 1 14, 1 j iOZ 


392 


28..34;31..36;46..55;77..85 




1 A /I 1 1 A . 1 O O T1/I 

1U4.. 1 14; I OO..ZZ4 


TO/I 


/I /I C 1 . 1 f\1 11/1 

44..M;1U7..1 14 i 


395 


2..15;51..61;82..88;104..114;249..259;286..298;333..340;361..367


396 


24..31;54..65;79..85;92..99;180..186;216..221


397


20..33;31..41;67..75;82..89;168..173


398 


58..65;93..101;135..143;198..203 


399 


11..17;37..48;71..79;94..100;99..112;132..144;161..173;173..18


400 


21..31;65..84;91..99 


401 


1..9;11..27;78..83;88..96;107..112;112..121;135..141;147..153;164..170


402 


1..9;11..35;83..94;106..111;111..120;134..140;146..152;163..169


403 


2..15;51..61;82..88;104..114;249..259;286..298;333..340;361..367


405 


104..114;188..224


406 


1..7;61..71;77..83;80..85;163..179;204..211;210..225;231..238;254..259;272..278;305..321;347..356


408 


53..77;120..130;144..159;159..169;196..202;266..272;331..345 


409 


1..9;73..81;226..236 
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410 


12..21;37..49;78..85


411 


22..36;151..156;161..170 


412 


56..65;136..146;179..186;192..207;227..236 


413 


8..14;92..99;184..191 


414 


1..10;16..30;71..77;95..105;102..113;123..130;137..145;165..171;199..205;217..226;286..295;309..314;360..376;384..389


415 


19..34;100..107;115..123;143..149;154..164;168..175;176..189;217..226;224..239;249..257;264..270;278..290;294..303;328..336;347..356;374..384;391..396;444..449;453..460;476..481


A 1 A 
*t 1 O 


10S 117-147 1SS-181 714-710 741-7S4 7A4-7A1 7 
IvJ,. 1 1Z, It/.-I JJ,lol..lo / ,ZUU..Z 1 *4,Z J V. .ZHJ ,ZJ4.,Z04,Z0 1 ..Z 




77-700 70S 


417 


8 14 


41 8 


8 1407 00-777 71fv7A0 771-701 100 


419 


8..16;117..122


/I OA 

4zU 


z / ..J 5, / 1 ..o3;9 1 ,.y /, 1 3 / .. 14o 


/I O 1 

4z 1 


O 11. *7"7 Ol 

z.. 1 1 ; / /..oz 


/loo 
4ZZ 


ZU..zo,o 1 ..OV 


423 


n in.Tc /ii /ca.co /cc.oo m, 1 a/; i it,n/; i i /i 

9.. 1 9, ZD. ,34,4 /..oUp /..oj;o/..Vz, lUo.. 1 lo; lzo.. 1 34 


yl O /I 

4Z4 


/C 10,11 /Co .on 0 /c. 0 a ac . 1 ao 111.100 1 ac\ 
0.. 1 o;z 1 ..34,j3..oz; /y..oo;8y..V:>; 10o.. 1 1 3; 128.. 1 49 


/IOC 

I 4zj 


O 11 -OO /liC/IO CC 

z.. 1 I ,zV..46;4/..jj 


4zo 


c O^.. 1 1 

j..zo;33..oo 


A OO 

4z / 


1 1 I.IO IO.IO /1A.CO /CA-OA AQ.1AA IAC.IOA IIO.ICC 1 £A 

I ..1 1 ; ly..3o;3o\.4V; jz..ou;9u..yo; 10U.. 10j;1z9.. 13 /;! jj..160 


478 


71 11*47 AW-iS 77-87 04 
ZJ , . j 1,4/..01 ,OJ .. /Z,o / ..yn- 




1 17-11 10-S0 AO-87 87 


410 


81 01 


41 1 
*+J 1 


1A 44*81 80 


417 


78 47-SA 7A- 110 117 


411 1 


S 14-41 40 


434 


9..17;15..21;61..70;80..89 


43j 


1 O/1-IO /1A.CO <A 

I ..Z4,3z..4U; JZ..0U 


/i i/c 
43o 


1 1 1 • 1 O 1 A. OA A "1 

1 1 1 ; l /..3U;zV..43 


/1 1"7 


on ^Aoc ic./io co 
ZU..3U,zo..3 j,4U..7U 


A1Q 

4 Jo 


1A OQ-7/i oc.qi on- 1 m 1 lO 


/IT A 

43V 


c a qo- i (\a i i n 
OO..VZ, 1U4.. 1 1U 


/l AC\ 

44 U 


i/c a^'Ah co./ci "7 1 • i n/^ ioi 
3o..4Z,4 / ..j /,OJ.. / 1 , 1 Uo.. 1 z 1 


44 1 


lO 0C..O7 


AA~) 
44Z 


1 1/1-0 1 OO-IA A1-AA C. 1 -^O G1-1A1 1AQ.1 1 1 110-118 1/1Q 
1 .. 1 4,z i ,.ZV,30..4Z,44..J I ,oZ. .o 1 , 1 \) 1 . . 1UV, 1 1 1 .. 1 1 V, 1 3o.. 14V 


441 


10 18-7S 11 11 40-S1 70-80 04 


444 


1 8-10 7A-17 44 


44 S 


1 11-10 18-18 40-S7 AO- 1 10 110 


446 


12..20;28..37;43..66;90..102


447 


15..20;24..31;36..47;68..82;88..96


448


29..45;83..91;88..94;132..144 


449


22..33;54..64;86..96;102..108


450 


27..39;47..60;101..107;155..164;270..281;287..300;306..312;327..332


451 


13..24;60..70;77..83 


452 


8..14;74..82


453 


6..12;63..78;77..91;97..103;102..108
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454 


32..44;66..72;101..114;166..174;209..235;243..252;258..263 


455 


69..76;131..139;164..173


456 


54..63;95..103;187..202;211..216;249..261


457 


14..21;31..45;80..88;187..194;347..353


458 


47..62;79..86


459 


1..8;27..37;90..97;99..106;123..140;145..163


460


8..17;35..45;131..139;162..169;175..180


461 


1..6;13..23;58..66;89..101 


462 


44..53;86..93 


463 


62..70


464 


50..57;59..69;67..73;79..95 


465


10..17;23..44 


466 


3..15;71..78;110..121;125..131;259..269;296..306;312..318;340..346;353..363;370..379;407..412;417..425;448..453;483..493


467 


5..12;20..30;70..78;82..100;106..115;129..135


468 


8..16;22..31;36..45;75..84 


469 


14..23;98..105;106..116 


470 


11..23;26..31;54..62;101..107


471 


23..29;66..81 


472 


23..29;93..100


473 


8..25;79..89;103..109


474 


37..45;80..89;94..101;125..130


475 


37..45;80..89;94..101;125..130 


476 


7..26;23..36;36..45;78..83;80..85 


477


45..53


478 


1..7;16..22;78..93;96..102 


479 


24..33;41..50;61..80;93..100;129..136;160..170;199..208;267..276;325..335


480 


5..14;43..51;102..116 


481 


2..15;16..24;53..62;87..97;100..109;109..133;145..152 


482 


2..15;16..24;53..62;87..97;100..109;109..133;145..152;168..176 
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Table VIII 



Seq Id No


Chromosomal location 


2 


16p11-p13 


12 


16p11-p13 


22 


7q35-q36 


25 


chr.19 


34 


chr.17 


35 


6p21.3 


40 


chr.20 


42 


12p13.3 


45 


12p13.3 


51 


12p 


56 


22q11.2-q13.2 


57 


12p13 


60 


chr.10 


62 


chr.17 


65 


Xq13 


67 


chr.14 


70 


chr.7(1);7q11.23-q21.1(1) 


71 


chr.7(1);7q11.23-q21.1(1) 


73 


6p21.3 


74 


6p21.3 


87 


19q13.1 


88 


7q21-q22 


94 


17q11.2 


99 


6q21 


101 


6p11.2-p21.3 


103 


chr.17 


106 


6q15-q16.3 


107 


16p13.3 


108 


12q 


113 


1p33-p34.3


125 


6p22.1-p22.3 


126 


16p13.3 


127 


14q11.2 


135 


22q11.2-q13.2


138 


chr.3 


141 


12q24.1 


146 


3p21.3 


147 


chr.2 


149 


chr.17 


150 


21q 


152 


21q 


154 


20q12-q13.11


155 


11p15.5 


160 


19q13.2 


161 


19q13.2 


162 


20q12-q13.11 


164 


21q 


166 


21q 
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170 


6p12.1-p21.1


172 


21q 


173 


chr.19 


176 


21q 


177 


21q 


179 


chr.6 


183 


chr.7 


185 


Xq21.3-q22.3 


186 


chr.20 


192 


11q12.2 


195 


chr.20 


196 


20q13.1-q13.2 


197 


7p15-p21 


198 


19q13.3 


199 


chr.2 


201 


Xq22.1-q23 


202 


Xq22.1-q23 


204 


chr.20 


205 


chr.5 


206


chr.2 


208 


chr.5 


214 


chr.12 


220 


Xq28 


224 


chr.7 


227 


chr.14 


230 


chr.7 


231 


chr.7 


238 


19p13.3 
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Table IX 


Seq Id No 


Tissue distribution 


1 


Br:28;FB:25;FK:9;Ov:17;Pl:12;Pr:4;SC:2;SG:4;Te:9 


2 


Br:2;CP:1;FB:5;FK:1;Pl:3;Pr:10;SG:1


3 


Br:1;CP:1;FB:33;FK:13;Li:2;Ov:19;PG:12;Pl:27;Pr:15;SG:9;SI:12


4 


AG:1;CP:1;LG:1;Pr:3;Te:1


5 


Pa:4;Pr:2


6 


Li:1;Pa:4;Pr:3


7 


Br:9;Pr:1;Te:3


8 


Br:4;FB:1;Pr:3;SG:8


9 


Br:4;Ce:1;Co:1;DM:4;FB:33;FK:16;He:3;Ki:6;LC:2;LG:4;Li:2;Lu:2;Ly:1;Ov:36;Pa:16;Pl:2;Pr:4;SC:2;SI:1;SN:1;Sp:1;UC:3;Ut:1


10 


Br:1;CP:1;Pr:4;SG:2


11 


Pr:2;SG:4 


12 


Br:1;CP:1;FB:5;FK:1;Pl:3;Pr:9;SG:1


13 


FL:4;Li:4 


14 


Li:4;Te:3 


15 


Te:1


16 


Li:3;Te:6 


17 


Ce:1;FB:6;Li:1;Pl:5;Te:16


18 


Li:7;Te:6 


19 


Li:27;Te:9 


20 


Li:1;Te:3


21 


Te:3 


22 


Te:3 


23 


Li:1;Te:6


24 


Li:2;Te:2 


25 


Te:8 


26 


Te:5 


27 


LC:1;Te:2


28 


Li:1;Te:2


29 


AG:2;BM:1;Br:16;CP:1;Co:2;DM:1;FB:45;FK:62;FL:1;HP:3;LC:1;Li:2;Mu:1;Ov:2;Pr:10;SI:5;SN:3;Te:9;UC:1


30 


Li:2 


31 


Br:3;CP:1;FB:1;FK:6;Pr:1;Te:2


32 


Br:1;CP:1;Ce:6;Ov:1;Te:2


33 


FK:5;SC:1 


34 


Br:1;FB:2;FK:48;Pl:2;SN:1


35 


Te:1


36 


FB:5;Pr:1;SN:1


37 


FB:3;FK:1;Li:1;SG:5


38 


FB:10 


39 


FB:3 


40 


Br:1;DM:1;FL:1;Pl:4;SG:13


41 


FB:3;FK:1;Li:1;SG:5
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42 


BM:1;SG:19 


43 


SG:1 


44 


CP:1;FB:1;Mu:2;Pl:9;SG:7


45 


BM:1;SG:20 


46 


BM:1;DM:1;FB:5;FK:6;FL:1;He:1;Ki:2;Ov:9;Pl:1;SG:1;SI:1;Te:1


47 


Br:4;FB:4;Pr:3;SG:8 


48 


Br:12;Ce:1;Co:1;FB:5;FK:4;FL:5;HP:1;Ki:1;LC:1;Li:6;Ov:8;Pl:105;SC:1;SG:8;Te:4


49 


Br:7;Ce:1;Co:1;FK:4;HP:1;Ki:1;LC:1;Li:5;Ov:8;Pl:5;SC:1;Te:1


50 


AG:1;CP:4;Ce:1;DM:2;FB:6;FK:4;FL:2;HP:2;LC:1;LG:3;Li:31;Lu:3;Mu:1;Ov:25;Pl:15;Pr:20;SC:1;Te:75;UC:5;Ut:1


51 


FL:1 


52 


Br:2;CP:1;FB:3;FK:1;FL:5;LC:2;Pl:1;Pr:2;UC:2


53 


Br:3;FK:4;FL:4;HP:1;Li:3;Pl:11;SG:1;Te:1


54 


Br:15;Ce:1;FB:10;FK:10;FL:1;He:1;Ki:6;LC:1;Li:4;Ov:32;Pa:3;Pl:2;Pr:4;SC:1;SN:2;Sp:4;Te:8;UC:1;Ut:1


55 


FL:2 


56 


Br:1;FB:1;FL:1;Te:1


57 


FL:4 


58 


FL:1;Li:1


59 


FB:3;FK:1;Li:1;SG:5


60 


Br:1;FB:1;FL:1;Pr:2


61 


Br:2;Pl:1


62 


Br:6;CP:1;Ce:7;FB:37;FK:4;FL:1;Pl:6;Pr:1;SG:3;SN:3;Te:1;UC:1


63 


Br:10


64 


Br:2;CP:2 


65 


Br:1;FB:11;LG:1;Th:1


66 


Br:30;Ce:1;Co:1;FB:60;FK:15;FL:3;HP:1;Ki:1;LC:1;Li:6;Ov:57;PG:9;Pl:145;Pr:21;SC:1;SI:4;Te:4


67 


Br:4;CP:1;FB:14;Ki:1;Li:1;Lu:2;Pr:1;Te:1


68 


Br:10


69 


AG:1;Br:48;FB:3;FK:5;HP:1;He:1;Li:1;Pl:11;SC:2;SG:1;Te:2;Ut:1


70 


Br:11;DM:1;He:1


71 


DM:1;He:1


72 


Br:9;Pr:1;Te:2


73 


Br:8;Pr:1


74 


Br:5 


76 


AG:1;Br:49;FB:4;FK:5;HP:1;He:1;LC:1;Li:1;Pl:11;SC:2;SG:1;Te:2;Ut:1


77 


Br:2;FK:2;HP:1;LC:1;Li:2;Ov:14;Pl:1;Pr:14;Te:5


78 


Br:9;Ce:1;DM:1;FB:21;FK:18;FL:1;HP:1;He:1;Ki:9;LC:2;LG:4;Li:2;Lu:2;Ov:34;Pl:3;Pr:4;SC:1;SI:2;SN:2;Sp:1;Ut:1


79 


Pr:1


80 


Br:9;CP:2;Co:1;DM:6;FB:1;FK:6;He:2;Ki:4;LC:1;LG:1;Li:7;Ov:40;Pa:1;Pl:2;Pr:1;SN:2;Sp:1;Te:12;UC:1;Ut:3


81 


FK:1;Te:1


82 


Li:1



528 



WO 01/42451 



PCT/IB00/01938



83 


Br:2;CP:1;FB:10;FK:2;Ki:3;Li:7;Ov:10;SC:1;SN:1;Te:1;UC:1


84 


Br:5;FB:14;FK:9;Li:6;Ov:17;SG:8;Te:8 


85 


Li:6;Te:2 


86 


Li:2;Te:2 


87 


Br:1;FB:35;FK:31;Li:20;Ov:37;PG:5;Pl:69;SI:5;Te:5


88 


Li:1;Pr:1;Te:7;Ut:2


89 


Te:1


90 


Te:2 


91 


FB:15;FK:3;Li:2;Ov:17;Pr:4;SG:7;Te:4


92 


Te:2 


93 


Br:4;FB:1;SN:1;Te:2


94 


Te:1


95 


Li:2 


96 


AG:1;Br:1;FB:1


97 


FK:5;Te:2 


98 


Te:3 


99 


Br:3;FB:29;FK:1;Li:10;Ov:1;Pl:16;Pr:2;SG:1;Te:49


100 


Br:2;FB:3;FK:1;Ov:3;Te:1


101 


Br:10;FB:34;FK:1;Ov:1;Pl:85;Pr:1;Ut:1


102 


FB:6 


103 


FB:6;Li:3;Pl:1;Pr:1;SG:1;Te:7


104 


Br:26;CP:1;FB:8;FK:131;Pl:20;Pr:20


105 


Br:3;CP:2;DM:2;FB:11;FK:3;LG:2;Ov:1;Pl:6;SC:2;SG:1;SN:4


106 


FB:4 


107 


Br:3;FB:50;FK:59;FL:3;Pr:1


108 


Br:1;FB:8;Li:1;Lu:1


109 


FB:11;Pr:21


110 


Br:14;Ce:1;FB:5;FK:5;FL:1;HP:1;He:2;Ki:3;LC:1;Li:4;Lu:1;Ov:7;Pl:2;Pr:1;SI:1;Sp:1;UC:1


111 


Br:1;Ce:1;FK:1;HP:1;He:2;Ki:3;LC:1;Li:3;Lu:1;Ov:7;Sp:1;UC:1


112 


Br:1;HP:1;Lu:1;Pr:1;SG:9;Ut:1


113 


HP:1;SG:4 


114 


FK:9 


115 


AG:1;Br:3;CP:1;FB:14;FK:19;FL:1;HP:1;Pr:1;SG:1


116 


Br:5;CP:1;Ce:1;Co:1;DM:5;FK:3;FL:1;LC:3;LG:1;Lu:1;Ov:23;Pl:1;Te:8;UC:2;Ut:4


117 


Br:1;Ce:1;FB:1;FK:1;FL:2;Pl:3;SN:1;Te:1;UC:1


118 


CP:1;DM:1;FB:5;FK:2;FL:2;He:1;Lu:1;Ly:1;Ov:23;Pr:1;SN:2;Sp:2;Ut:1


119 


Li:2;Te:7 


120 


Br:6;Co:2;FB:1;FK:6;FL:1;Ov:3;Pl:32;Pr:1;SN:1


121 


AG:1;Br:4


122 


Br:5 


123 


Br:1


124 


Br:2;Ki:2;Li:1;Ov:7;UC:1


125 


Br:2;FB:1;FL:6;He:1;Li:1;Ov:1;Pl:2;Pr:10;Te:1;Th:1


126 


Br:1
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127 


BM:1;Br:2;CP:2;FB:1;FK:3;HP:1;He:1;LG:1;Pl:1;Pr:1;SC:2;SG:2;Te:5;Ut:3


128 


Br:1


129 


Br:2;FB:6;Li:1;SG:3;Te:2


130 


Br:25;FB:3;FL:2 


131 


Br:1


132 


Br:1


133 


Br:1


134 


Br:2;SN:1


135 


Br:1


136 


AG:1;Br:1;FL:1


137 


Br:1;Ce:1;FB:1;FK:1;FL:2;Pl:3;SN:1;Te:1;UC:1


138 




139 


Br- 1 1 -rP'2-Co* 1 *DM*6*FB- 1 FK CrHe J'Ki'A'l C- 1 I G- 1 Ov ^O'Pa- 1 'Pl-2-Pr 1 SN- 
2;Sp:1;Te:9;UC:1;Ut:3


140 


Br:23;Ce:1;DM:3;FB:38;FK:17;FL:2;HP:1;He:1;Ki:8;LC:3;LG:2;Li:6;Lu:1;Ly:1;Ov:40;Pr:4;SC:2;SN:4;Sp:1;Te:5;UC:1;Ut:1


141 


Br:39;FB:3;SN:2 


142 


Br:10;SN:2 


143 


Br:26;FK:2;HP:1;LC:1;Li:2;Ov:14;Pl:3;Pr:3;Te:5


144 


Br:14;Pr:2 


145 


FB:12;LG:1;Pr:4;Te:1;Ut:2


146 


Li:1;Ov:2;Pr:5;SG:11


147 


Li:1;Te:1


148 


Br:1;FB:1;Li:1;Te:1


149 


Br:3;FB:5;FK:5;Li:1;Pl:8;Te:5


150 


FK:6;Pr:2;SG:8 


151 


FK:9 


152 


FK:6;Pr:2;SG:9 


153 


Te:1


154 


FB:28;Ov:4 


155 


Br:21;Ce:1;FB:32;FK:4


156 


Br:5;CP:1;FB:16;FK:3;He:1;Ki:5;Li:1;Ov:15;Pl:3;SG:2;SI:1;Sp:1;UC:1


157 


FB:14;FK:1;FL:1;SG:1 


158 


FB:7 


159 


FB:10 


160 


Ce:2;FB:12 


161 


Ce:2 


162 


FB:28;Ov:2 


163 


FB:14;FK:1;FL:1;SG:1 


164 


FK:4;Pr:1;SG:9


165


Br:4;Co:1;Ki:1;Ov:2;Pr:4;SG:1


166 


FK:6;Pr:2;SG:9


167 


Br:1;FB:1;SG:5


168 


Br:1;FB:5;FK:7;SG:1;UC:1


169 


FK:2 


170 


FL:12 
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171 


Br:2;FB:1;FK:1;Pl:7


172 


Br:106;FB:2;Pl:7 


173 


Br:14;FB:1;Pl:2;Te:1


174 


Br:17;He:1;Pl:1;SC:2;Te:1


175 


Br:14;Pr:2 


176 


Br:106;FB:2;Pl:7 


177 


Br:114;FB:7;FK:7;Ov:2;Pl:7;Pr:2;Te:9 


178 


Br:16;CP:2;FB:2;FK:2;FL:1;Li:1;Pl:13;Pr:3;SC:1;Ut:1


179 


FL:1;HP:2;Pr:2;Te:1


180 


Pr:2 


181


FB:1;Ov:2;Pr:1;UC:1


182 


BM:1;Br:4;DM:1;FB:6;FK:6;Ki:5;LC:2;LG:1;Li:1;Lu:1;Ov:15;Pl:1;Pr:2;SC:1;Sp:2;Te:2;Ut:1


183 


Br:8;CP:1;Co:2;DM:4;FB:1;FK:1;Ki:4;LC:1;Li:3;Ov:33;Pl:1;Pr:5;SC:2;SN:1;Sp:1;Te:5;UC:1;Ut:2


184 


Pr:1


185 


FB:2;Li:1;Ov:1;SG:7;Te:5


186 


Te:3 


187 


Te:1


188 


Br:18;CP:1;DM:5;FB:40;FK:23;FL:2;He:3;Ki:10;LC:2;LG:1;Li:13;Lu:3;Ly:2;Mu:1;Ov:54;Pl:5;Pr:14;SC:2;SG:2;SI:2;SN:4;Sp:3;Te:4;UC:4


189 


Li:1;Te:1


190 


Br:7;CP:1;FB:1;FK:4;FL:5;He:1;Li:1;Ov:1;Pl:2;Pr:4;SG:1


191 


Li:2;Te:4 


192 


AG:1;Br:2;CP:1;FB:32;FK:1;Li:1;Ov:36;Pl:49;Pr:3;SC:1;SG:4;SN:4;Te:9;UC:1;Ut:2


193 


FB:31;FK:75;FL:7;Ov:12;Pl:23;Pr:8;SG:3;Te:16 


194 


Te:2 


195 


Te:7 


196


Te:2 


197 


Te:3 


198 


Li:10;Te:43 


199


Br:35;CP:3;FB:39;FK:56;FL:7;HP:1;LG:1;Li:1;Ly:1;Ov:2;Pl:10;Pr:8;SG:1;Te:4;Ut:2


200 


FB:17;FK:9;FL:5;Ov:21;Pl:41;Te:3


201 


FK:16;SI:1 


202 


Br:1;Co:1;FB:111;FK:25;He:1;Li:4;Ov:3;Pr:6;Te:1


204 


Te:7 


205 


Li:7;Te:28 


206 


FB:28;Li:2;Ov:23;PG:11;Pl:45;SG:17;SI:11;Te:9


207 


FB:16;FK:1;Ov:1;SC:1;Te:1


208 


FB:5 


209 


FB:6 


210


Br:1;FB:22


211


Br:2;Ce:3;FB:6;FK:1


212 


Br:1;Co:2;FB:22;FK:2;LG:2;Mu:2;Pl:2;SG:4
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213 


Br:2;DM:1;FB:8;FK:8;FL:1;Ki:1;LG:3;Ov:5;Pa:1;Pl:4;Pr:1;SN:2;UC:1


214 


FB:7 


215 


FB:4 


216 


Ov:3;SG:3 


217 


Br:4;CP:2;DM:1;FB:9;FK:3;Ki:2;LC:1;LG:1;Lu:3;Ly:1;Ov:14;Pl:1;Pr:1;SC:1;SG:2;Sp:1;Te:1;Ut:1


218 


FB:4;FK:2;Pl:1;Pr:11;SG:1


219 


Br:7;CP:3;FB:2;FL:1;HP:4;Lu:1;Ly:2;Mu:1;Ov:3;Pl:1;Pr:1;SN:2;Te:1


220 


Br:1;FL:1;Pl:2


221 


Co:1;FB:2;FL:1;Li:1;Pl:2


222 


FL:1;SG:2 


223 


Li:1;Te:1


225 


Li:10


226 


Li:1;Te:4


227 


Li:1


228 


Br:1


229 


Br:3 


230 


Br:5;Ce:1;Co:1;DM:3;FB:1;FK:1;He:1;LC:1;LG:2;Ov:16;Pl:3;Pr:1;Te:2;Ut:1


231 


Br:3;Ce:1;Co:1;DM:3;FB:1;FK:1;He:1;LC:1;LG:2;Ov:16;Pl:3;Pr:1;Te:2;Ut:1


232 


AG:1;Br:17;CP:2;DM:1;FB:51;FK:9;FL:3;Li:3;Ov:3;Pl:2;Pr:10;SC:1;SG:5;Te:2;Ut:1


233 


Br:13


234 


Br:5 


235 


Br:1;Pl:1


236 


Br:9 


237 


Br:22;DM:2;FB:17;FK:9;Ki:4;LG:1;Li:1;Lu:2;Ov:24;Pr:3;SC:1;SI:1;SN:2;Te:2


238 


Br:17


239 


Br:11


240 


Br:28;Ce:1;DM:5;FB:52;FK:40;FL:2;HP:1;He:2;Ki:3;LC:1;LG:3;Li:1;Ly:1;Ov:28;Pl:1;Pr:5;SC:1;SI:1;SN:3;Sp:6;Te:1;UC:1;Ut:1


241 


Br:4;Ce:1;DM:5;FB:5;FK:7;HP:1;He:2;Ki:3;LC:1;LG:3;Li:1;Ly:1;Ov:28;Pl:1;SC:1;SN:3;Sp:6;Te:1;UC:1;Ut:1
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Table X 



Seq Id No 


Low frequency 
expression 


High frequency 
expression 


1 




Br,Ov 


2 




Pr 


3 


Br,Te 


Ov,PG,Pl,SI 


4 




AG 


5 




Pa 


6 


-


Pa


7 




Br 


8 




SG 


9 


Br,Te 


DM,He,Ki,Ov,Pa 


10 




Pr 


11




SG 


12 




Pr 


13 


- 


FL,Li 


14 




Li,Te 


15 


- 


Te 


16 




Li,Te 


17 


- 


Te 


18 


- 


Li,Te 


19 




Li,Te 


20 




Te 


21 


- 


Te 


22 




Te 


23 




Te


24 


- 


Li 


25 


-


Te 


26 




Te 


27 




LC,Te 


28 




Te 


29 


Pl


FK 


30 




Li 


31




FK 


32 




Ce 








34 


FB 


FK 


35 




Te 


36 




SN 


37 




SG 


38 




FB 


40 




SG


41 




SG 


42 




BM,SG 


43 




SG 
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44 


- 


Mu,Pl,SG 


45 


- 


BM,SG 


46 


- 


BM,Ki,Ov 


47 


- 


SG 


48 


FB,FK,Pr 


Pl


49 


- 


Ki,Ov 


50 


Br,FB,FK,SG 


Li,Ov,Te 


51 




FL 


52 


- 


FL,LC,UC


53 


- 


Pl


54 


- 


Ki,Ov,Pa,Sp 


55 


- 


FL 


57 


- 


FL 


58 




FL 


59 


- 


SG 


62 


- 


Ce,FB 


63 


- 


Br 


64 


- 


CP 


65 


- 


FB,Th 


66 


FK,SG,Te 


Ov,PG,Pl 


67 


- 


FB,Ki,Lu 


68 


- 


Br 


69 


FB 


Br 


70 


- 


Br,DM,He 


71 


- 


DM,He


72 


- 


Br 


73 


- 


Br 


74 


- 


Br 


75 


- 


Br 


76 


FB 


Br 


77 


FB 


Ov,Pr 


78 


- 


Ki,Ov 


80 


FB 


DM,Ki,Ov 


82 


- 


Li 


83 


- 


Ki,Li,Ov


84 


- 


Ov 


85 


- 


Li 


86 


- 


Li 


87 


Br,Pr,SG 


Ov,PG,Pl 


88 




Te,Ut 


89 




Te 


90 




Te 


91 




Ov 


92 




Te 


93 




SN 


94 




Te 
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95 


- 


Li 


96 


- 


AG 


97 


- 


FK 


98 


- 


Te 


99 


FK 


Te 


100 


- 


Ov 


101 


FK 


Pl


102 


- 


FB 


103 


- 


Te 


104 


FB,Li,SG,Te 


FK 


105 


- 


DM,SN 


106 


- 


FB 


107 


Br,Pl 


FB,FK 


108 


- 


FB,Lu


109


- 


Pr 


110 


- 


He,Ki,Ov 


111 


- 


Ce,He,Ki,Lu,Ov 


112 


- 


Lu,SG 


113 


- 


HP,SG 


114 


- 


FK 


115 


- 


FK 


116 


FB 


DM,LC,Ov,Ut 


117


- 


Ce,UC 


118 


- 


Ov,Sp 


119 


- 


Te 


120 


FB 


Co,Pl 


121 


- 


AG,Br


122 


- 


Br 


124 


- 


Ki,Ov 


125 


- 


FL,Pr,Th 


127 


- 


BM,SC,Ut 


130 


- 


Br 


134 


- 


SN 


136 


- 


AG 


137 


- 


Ce,UC 


138 


FB 


Br 


139 


FB 


DM,Ki,Ov,Ut 


140 


Pl


Ki,Ov 


141 


- 


Br 


142 




Br,SN 


143 


FB 


Br,Ov 


144 




Br 


145 




FB,Ut 


146 




SG 


149 




Pl


150 




FK,SG 
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151 


- 


FK 


152 


- 


FK,SG


153 


- 


Te 


154 


- 


FB,Ov 


155 


- 


Br,FB 


156 


- 


Ki,Ov 


157 


- 


FB 


158 


- 


FB 


159 


- 


FB 


160 


- 


Ce,FB 


161 




Ce 


162 


- 


FB 


163 


- 


FB 


164


- 


SG 


165 


- 


Co,Ki,Ov 


166 


- 


FK,SG 


167 


- 


SG 


168 


- 


FK 


169 


- 


FK 


170 


- 


FL 


171 


- 


PI 


172 


FB,FK,Pr 


Br 


173 


- 


Br 


174 


- 


Br,He,SC 


175 


- 


Br 


176 


FB,FK,Pr 


Br 


177 


FB 


Br 


178 


- 


Br,Pl 


179 


- 


HP 


180 


- 


Pr | 


181 


- 


Ov,UC 


182 | 


- 


Ki,Ov,Sp 


183 


FB 


DM,Ki,Ov 


185 


- 


SG/Te 


186 


- 


Te 


187 


- 


Te 


188 


PI 


DM,Ki,Ov 


190 


- 


FL | 


191 


- 


Te 


192 


Br,FK 


Ov,Pl 


193 


Br 


FK,Ov 


194 




Te 


195 




Te 


196 




Te 


197 




Te 


198 


FB 


Li,Te 
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I 199 


- 


FK 


200   Br            Ov,Pl
201   -             FK
202   -             FK
203   Br,Pl         FB
204   -             Te
205   -             Li,Te
206   Br,FK,Pr      Ov,PG,Pl,SG,SI
207   -             FB
208   -             FB
209   -             FB
210   -             FB
211   -             Ce
212   -             Co,FB,Mu
213   -             Ki,LG,Ov
214   -             FB
215   -             FB
216   -             Ov,SG
217   -             Ki,Lu,Ov
218   -             Pr
219   -             CP,HP,Ly,Ov,SN
221   -             Co
222   -             SG
223   -             SG
225   -             Li
226   -             Te
227   -             Li
229   -             Br
230   -             DM,Ov
231   -             DM,Ov
232   -             FB
233   -             Br
234   -             Br
236   -             Br
237   -             Ki,Lu,Ov
238   -             Br
239   -             Br
240   Pl,Te         DM,FK,Ki,Ov,Sp
241   FB            DM,He,Ki,Ov,Sp
Table XI

Seq Id No   Subcellular localization
7           nuclear
13          extracellular, including cell wall
20          mitochondrial
21          nuclear
26          nuclear
35          nuclear
37          endoplasmic reticulum
38          extracellular, including cell wall
39          endoplasmic reticulum
41          endoplasmic reticulum
59          endoplasmic reticulum
70          nuclear
71          nuclear
72          nuclear
78          nuclear
98          nuclear
99          nuclear
105         mitochondrial
108         endoplasmic reticulum
116         mitochondrial
117         mitochondrial
134         nuclear
135         nuclear
137         mitochondrial
159         nuclear
160         nuclear
161         nuclear
1/1         nuclear
1 /o        endoplasmic reticulum
1 oz        nuclear
184         nuclear
185         endoplasmic reticulum
186         nuclear
187         nuclear
188         nuclear
194         nuclear
195         nuclear
196         nuclear
200         mitochondrial
204         nuclear
205         nuclear
206         nuclear






WO 01/42451 



PCT/IB00/01938



211         nuclear
212         nuclear
213         nuclear
214         endoplasmic reticulum
215         endoplasmic reticulum
216         endoplasmic reticulum
218         nuclear
220         endoplasmic reticulum
224         nuclear
225         nuclear
230         mitochondrial
231         mitochondrial
238         cytoplasmic
Table XII

Seq Id No in     Internal designation            Seq Id No in
priority                                         present
application                                      application
1 IV             1 lV-UUJ-4-U-Cz-Co              1
220              1 AC A1 1 A T~\0 r^C            2
345              1 05-U 1 o-3-u-U 1 U-CS         3
334              i ac ao*c i a a c r^c           4
159              105-031-1-0-A11-CS              5
2 IV             103-031-2-0-D3-CS               6
250              105-035-2-0-CG-CS               7
217              105-037-2-0-H11-CS              8
340              105-053-4-0-E8-CS               9
115              105-074-3-0-H10-CS              10
31               105-089-3-0-G10-CS              11
198              105-095-2-0-G11-CS              12
154              106-006-1-0-E3-CS               13
366              106-037-1-0-E9-CS.cor           14
366              106-037-1-0-E9-CS.fr            15
79               106-043-4-0-H3-CS               16
95               110-007-1-0-C7-CS               17
364              114-016-1-0-H8-CS               18
246              116-004-3-0-A6-CS               19
187              116-054-3-0-E6-CS               20
203              116-055-1-0-A3-CS               21
298              116-055-2-0-r /-CS              22
277              116-088-4-0-A9-CS               23
41               116-091-1-0-D9-CS               24
353              1 lo-l lU-Z-U-r4-Co             25
/o               1 io-i 1 I ~ 1 -U-l iv-Cc>      26
245              1 1 A 111 /I A 13 1 f^C         27
104              110-1 l>Z-U~ro-Lo               28
Z5V              1 1 A i i o o a UK r^Q          29
2oy              1 1 /-UU1 -5-U-kj3-Co           30
1 DO             145-Z5-3-U-£S4-Co.COr           31
j I oo           i i +j-z>?-J-u-r> i +-^ij.ir    32
169              145-56-3-0-D5-CS                33
312              145-59-2-0-A7-CS                34
273              157-15-4-0-B11-CS               35
190              160-103-1-0-F11-CS              36
244              160-37-2-0-H7-CS                37
151              160-58-3-0-H3-CS                38
149              160-75-4-0-A9-CS                39
307              174-10-2-0-F8-CS                40
264              174-33-3-0-F6-CS                41
168              174-38-1-0-B6-CS                42


202              174-38-3-0-C9-CS                43
28               174-39-2-0-A3-CS                44
331              174-4M-0-A6-CS                  45
258              174-5-3-0-H7-CS                 46
84               174-7-4-0-H1-CS                 47
294              175-1-3-0-E5-CS.cor             48
294              175-1-3-0-E5-CS.fr              49
310              180-19-4-0-F4-CS                50
311              181-10-1-0-D10-CS               51
263              181-16-1-0-G7-CS                52
304              181-16-2-0-A7-CS                53
109              181-20-3-0-B5-CS                54
121              181-3-3-0-B8-CS                 55
181              181-3-3-0-C9-CS                 56
191              182-1-2-0-D12-CS                57
193              184-1-4-0-C11-CS                58
192              184-4-1-0-A11-CS                59
116              187-12-4-0-A8-CS                60
268              187-2-2-0-A3-CS                 61
123              187-31-0-0-H2-CS                62
234              187-34-0 0-11 2-CS              63
185              187-37-0-0-c10-CS               64
279              187-38-0-04 10-CS               65
114              187-39-0-0-k12-CS               66
211              187-41-0-0-i21-CS               67
236              188-11-1-0-B3-CS                68
35               188-18-4-0-A9-CS                69
299              188-28-4-0-B12-CS.cor           70
299              188-28-4-0-B12-CS.fr            71
72               188-28-4-0-D4-CS                72
242              188-41-1-0-B8-CS.cor            73
242              188-41-1-0-B8-CS.fr             74
173              188-45-1-0-D9-CS                75
106              188-9-2-0-bl-CS                 76
130              105-079-3-0-A11-CS              77
323              105-092-1-0-H7-CS               78
160              105-141-4-0-H9-CS               79
272              109-013-1-0-B9-CS               80
ZZo              110-008-4-0-D9-CS               81
333              114-001-3-0-A2-CS               82
315              114-028-2-0-C1-CS               83
300              114-032-1-0-H10-CS              84
57               114-043-2-0-A10-CS              85
137              114-044-1-0-C5-CS               86
107              116-003-3-0-D10-CS              87
164              116-003-3-0-G12-CS              88


108              116-011-2-0-F11-CS              89
101              116-033-3-0-E4-CS               90
157              116-041-4-0-B6-CS               91
75               116-044-2-0-C4-CS               92
322              116-075-1-0-E6-CS               93
124              116-094-4-0-G5-CS               94
289              117-005-3-0-F2-CS               95
122              121-007-3-0-D9-CS               96
208              145-91-3-0-D10-CS               97
282              157-17-1-0-F4-CS                98
129              160-11-3-0-G8-CS                99
317              160-24-1-0-F12-CS               100
308              160-24-2-0-E9-CS                101
25               160-25-4-0-D2-CS                102
243              160-31-3-0-A11-CS               103
346              160-32-1-0-F6-CS                104
60               160-37-1-0-A3-CS                105
305              160-40-3-0-E9-CS                106
48               160-58-3-0-E4-CS                107
238              160-85-3-0-D4-CS                108
251              160-95-3-0-A11-CS               109
196              162-10-4-0-F9-CS.cor            110
196              162-10-4-0-F9-CS.fr             111
347              174-13-2-0-E4-CS                112
77               174-46-2-0-B11-CS               113
188              179-8-2-0-A6-CS                 114
235              180-22-3-0-B6-CS                115
45               181-13-1-0-F7-CS                116
265              181-15-4-0-F7-CS                117
280              181-20-1-0-G7-CS                118
281              184-15-3-0-D1-CS                119
39               187-12-2-0-G11-CS               120
165              187-2-2-0-A12-CS                121
326              187-30-0-0-k23-CS               122
330              187-36-0-0-e19-CS               123
368              187-38-0-0-d22-CS               124
71               187-39-0-0-b9-CS                125
224              187-39-0-0-g6-CS                126
90               187-45-0-0-l18-CS               127
216              187-45-0-0-m21-CS               128
83               187-45-0-0-n8-CS                129
342              187-46-0-0-f23-CS               130
262              187-5-1-0-A12-CS                131
257              187-5-1-0-F6-CS                 132
293              187-5-2-0-B2-CS                 133






231              187-5-3-0-D5-CS                 134
287              187-51-0-0-f9-CS                135
325              187-6-1-0-B9-CS                 136
309              187-6-4-0-C10-CS                137
359              188-19-2-0-C8-CS                138
68               188-22-4-0-G6-CS                139
233              188-28-4-0-D11-CS               140
369              188-29-1-0-E10-CS               141
155              188-34-4-0-E5-CS                142
327              188-9-3-0-A5-CS                 143
283              105-021-3-0-C3-CS               144
29               105-037-4-0-H12-CS              145
100              105-073-2-0-A7-CS               146
99               109-002-4-0-C6-CS               147
360              109-003-1-0-G4-CS               148
321
1. An isolated polynucleotide, said polynucleotide comprising a nucleic acid sequence
encoding:

i) a polypeptide comprising an amino acid sequence having at least
about 80% identity to any one of the sequences shown as SEQ ID
NOs:242-482 or any one of the sequences of polypeptides encoded
by the clone inserts of the deposited clone pool; or
ii) a biologically active fragment of said polypeptide.

2. The polynucleotide of claim 1, wherein said polypeptide comprises any one of the
sequences shown as SEQ ID NOs:242-482 or any one of the sequences of the polypeptides encoded
by the clone inserts of the deposited clone pool.

3. The polynucleotide of claim 1, wherein said polypeptide comprises a signal peptide.

4. The polynucleotide of claim 1, wherein said polypeptide is a mature protein.

5. The polynucleotide of claim 1, wherein said nucleic acid sequence has at least about
80% identity over at least about 100 contiguous nucleotides to any one of the sequences shown as
SEQ ID NOs:1-241 or any one of the sequences of the clone inserts of the deposited clone pool.

6. The polynucleotide of claim 1, wherein said polynucleotide hybridizes under
stringent conditions to a polynucleotide comprising any one of the sequences shown as SEQ ID
NOs:1-241 or any one of the sequences of the clone inserts of the deposited clone pool.

7. The polynucleotide of claim 5, wherein said nucleic acid sequence comprises any
one of the sequences shown as SEQ ID NOs:1-241 or any one of the sequences of the clone inserts of
the deposited clone pool.

8. The polynucleotide of claim 1, wherein said polynucleotide is operably linked to a
promoter.

9. An expression vector comprising the polynucleotide of claim 8.

10. A host cell recombinant for the polynucleotide of claim 1.

11. A non-human transgenic animal comprising the host cell of claim 10.



12. A method of making a GENSET polypeptide, said method comprising

a) providing a population of host cells comprising the polynucleotide of
claim 8; and

b) culturing said population of host cells under conditions conducive to the
production of said polypeptide within said host cells.

13. The method of claim 12, further comprising purifying said polypeptide from said
population of host cells.

14. A method of making a GENSET polypeptide, said method comprising

a) providing a population of cells comprising the polynucleotide of claim 8;

b) culturing said population of cells under conditions conducive to the
production of said polypeptide within said cells; and

c) purifying said polypeptide from said population of cells.

15. An isolated polynucleotide, said polynucleotide comprising a nucleic acid sequence
having at least about 80% identity over at least about 100 contiguous nucleotides to any one of the
sequences shown as SEQ ID NOs:1-241 or any one of the sequences of the clone inserts of the
deposited clone pool.

16. The polynucleotide of claim 15, wherein said polynucleotide hybridizes under
stringent conditions to a polynucleotide comprising any one of the sequences shown as SEQ ID
NOs:1-241 or any one of the sequences of the clone inserts of the deposited clone pool.

17. The polynucleotide of claim 15, wherein said polynucleotide comprises any one of
the sequences shown as SEQ ID NOs:1-241 or any one of the sequences of the clone inserts of the
deposited clone pool.

18. A biologically active polypeptide encoded by the polynucleotide of claim 15.

19. An isolated polypeptide or biologically active fragment thereof, said polypeptide
comprising an amino acid sequence having at least about 80% sequence identity to any one of the
sequences shown as SEQ ID NOs:242-482 or any one of the sequences of polypeptides encoded by
the clone inserts of the deposited clone pool.

20. The polypeptide of claim 19, wherein said polypeptide is selectively recognized by
an antibody raised against an antigenic polypeptide, or an antigenic fragment thereof, said antigenic
polypeptide comprising any one of the sequences shown as SEQ ID NOs:242-482 or any one of the
sequences of polypeptides encoded by the clone inserts of the deposited clone pool.

21. The polypeptide of claim 19, wherein said polypeptide comprises any one of the
sequences shown as SEQ ID NOs:242-482 or any one of the sequences of polypeptides encoded by
the clone inserts of the deposited clone pool.

22. The polypeptide of claim 19, wherein said polypeptide comprises a signal peptide.

23. The polypeptide of claim 19, wherein said polypeptide is a mature protein.

24. An antibody that specifically binds to the polypeptide of claim 19.

25. A method of determining whether a GENSET gene is expressed within a mammal,
said method comprising the steps of:

a) providing a biological sample from said mammal;
b) contacting said biological sample with either of:

i) a polynucleotide that hybridizes under stringent conditions to the
polynucleotide of claim 1; or

ii) a polypeptide that specifically binds to the polypeptide of claim 19; and
c) detecting the presence or absence of hybridization between said polynucleotide
and an RNA species within said sample, or the presence or absence of binding
of said polypeptide to a protein within said sample;
wherein a detection of said hybridization or of said binding indicates that said GENSET gene is
expressed within said mammal.

26. The method of claim 25, wherein said polynucleotide is a primer, and wherein said
hybridization is detected by detecting the presence of an amplification product comprising the
sequence of said primer.

27. The method of claim 25, wherein said polypeptide is an antibody.

28. A method of determining whether a mammal has an elevated or reduced level of
GENSET gene expression, said method comprising the steps of:

a) providing a biological sample from said mammal; and

b) comparing the amount of the polypeptide of claim 19, or of an RNA species
encoding said polypeptide, within said biological sample with a level
detected in or expected from a control sample;

wherein an increased amount of said polypeptide or said RNA species within said biological
sample compared to said level detected in or expected from said control sample indicates that said
mammal has an elevated level of said GENSET gene expression, and wherein a decreased amount
of said polypeptide or said RNA species within said biological sample compared to said level
detected in or expected from said control sample indicates that said mammal has a reduced level of
said GENSET gene expression.

29. A method of identifying a candidate modulator of a GENSET polypeptide, said
method comprising:

a) contacting the polypeptide of claim 18 with a test compound; and
b) determining whether said compound specifically binds to said
polypeptide;

wherein a detection that said compound specifically binds to said polypeptide indicates that
said compound is a candidate modulator of said GENSET polypeptide.
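
Claims 5 and 15 recite at least about 80% identity over at least about 100 contiguous nucleotides. Purely as an illustration, the following Python sketch shows one way such a windowed identity figure might be computed. It assumes the two sequences have already been aligned to equal length without gaps, which is a simplification; the claims themselves do not name a comparison algorithm. The function names `best_window_identity` and `meets_identity_threshold` and their default parameters are hypothetical and do not come from this document.

```python
def best_window_identity(query: str, reference: str, window: int = 100) -> float:
    """Return the highest fraction of matching positions found in any run of
    `window` contiguous positions.

    Assumption (not from the claims): both sequences are pre-aligned,
    ungapped, and of equal length.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be pre-aligned to equal length")
    if window <= 0 or len(query) < window:
        raise ValueError("window must be positive and no longer than the sequences")

    q, r = query.upper(), reference.upper()
    # 1 where the aligned bases agree, 0 where they differ.
    matches = [1 if a == b else 0 for a, b in zip(q, r)]

    # Sliding-window sum: count matches in the first window, then shift by one.
    current = sum(matches[:window])
    best = current
    for i in range(window, len(matches)):
        current += matches[i] - matches[i - window]
        best = max(best, current)
    return best / window


def meets_identity_threshold(query: str, reference: str,
                             window: int = 100, threshold: float = 0.80) -> bool:
    """True if some contiguous window reaches the identity threshold."""
    return best_window_identity(query, reference, window) >= threshold
```

In practice, sequences of different lengths or with insertions and deletions would first need to be aligned by a dedicated alignment tool; this sketch only covers the counting step that follows such an alignment.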